Abstract 

The problem of finite-dimensional asymptotics of infinite-dimensional dynamic 
systems is studied. A non-linear kinetic system with conservation of supports for 
distributions has generically finite-dimensional asymptotics. Such systems are ap- 
parent in many areas of biology, physics (the theory of parametric wave interaction), 
chemistry and economics. This conservation of support has a biological interpretation: 
inheritance. The finite-dimensional asymptotics demonstrates effects of "natural" 
selection. Estimates of the asymptotic dimension are presented. After some 
initial time, the solution of a kinetic equation with conservation of support becomes a 
finite set of narrow peaks that become increasingly narrow over time and move 
increasingly slowly. It is possible that these peaks do not tend to fixed positions, and 
the path covered tends to infinity as $t \to \infty$. The drift equations for peak motion 
are obtained. Various types of distribution stability are studied: internal stability 
(stability with respect to perturbations that do not extend the support), external 
stability or uninvadability (stability with respect to strongly small perturbations 
that extend the support), and stable realizability (stability with respect to small 
shifts and extensions of the density peaks). Models of self-synchronization of cell di- 
vision are studied, as an example of selection in systems with additional symmetry. 
Appropriate construction of the notion of typicalness in infinite-dimensional space 
is discussed, and the notion of "completely thin" sets is introduced. 

1 Introduction: Unusual conservation law 



There are three geometrically distinguished classes of dynamical systems originating from 
natural sciences: 

• Hamiltonian systems; 

• Dissipative systems with entropy (or another thermodynamic Lyapunov function); 
and 

• Systems with inheritance. 

Hamiltonian systems originated from mechanics. Symplectic geometry followed them. 
Dissipative systems with thermodynamic Lyapunov functions arose from thermodynam- 
ics and kinetics. The related geometry is the geometry of Legendre transformation and 
contact structures (this subject is not exhausted yet; see for example [Hl[28] and a recent 
publication [36]). Systems with inheritance are emerging from population dynamics, phys- 
ical kinetics, turbulence theory, and many other fields of natural science. The geometrical 
sense of inheritance is a special conservation law, in which the conserved "quantity" is a 
set, a support of distribution. 

In the 1970s to the 1980s, theoretical work developed another "common" field simul- 
taneously applicable to physics, biology and mathematics. For physics it is (so far) part 
of the theory of a special kind of approximation, demonstrating, in particular, interesting 
mechanisms of discreteness in the course of the evolution of distributions with initially 
smooth densities. However, what for physics is merely a convenient approximation is a 
fundamental law in biology: inheritance. The consequences of inheritance (collected in 
the selection theory [T51I3EII5Z1IEICSII21I221I1QII2SIEIIIS1) give one of the most important 
tools for biological reasoning. 

This paper is not a review of the scientific literature on evolution, and we mention 
here only references that are particularly important for our understanding of the selection 
theory and its applications. 

Consider a community of animals. Let it be biologically isolated. Mutations can be 
neglected in the first approximation. In this case, new genes do not emerge. 

An example from physics is as follows. Let waves with wave vectors k be excited in 
some system. Denote by K the set of wave vectors k of excited waves. Suppose the wave 
interaction does not lead to the generation of waves with new $k \notin K$. Such an 
approximation is applicable to a variety of situations, and has been described in detail 
for wave turbulence in [85], [86]. 

What is common in these examples is the evolution of a distribution with a support 
that does not increase over time. 

What does not increase must, as a rule, decrease, if the decrease is not prohibited. This 
naive thesis can be converted into rigorous theorems for the case under consideration [28]. 
It is proved that the support decreases in the limit $t \to \infty$ if it was sufficiently large 
initially. (At finite times the distribution supports are conserved and decrease only in 
the limit $t \to \infty$.) Conservation of the support usually results in the following effect: 
the dynamics of an initially infinite-dimensional system at $t \to \infty$ can be described by 
finite-dimensional systems. 

The simplest and most common class of equations in applications for which the distribution 
support does not grow over time is constructed as follows. To each distribution 
$\mu$ is assigned a function $k_\mu$ by which distributions can be multiplied. Let us write down 
the equation: 

$$\frac{d\mu}{dt} = k_\mu \times \mu. \qquad (1)$$

The multiplier $k_\mu$ is called a reproduction coefficient. It depends on $\mu$, and this 
dependence can be rather general and non-linear. 
Two remarks are important: 

1. The apparently simple form of (1) does not mean that this system is linear or even 
close to linear. The operator $\mu \mapsto k_\mu$ is a general non-linear operator, and the only 
restriction is its continuity in an appropriate sense (see below). 

2. On a finite set $X = \{x_1, \ldots, x_n\}$, non-negative measures $\mu$ are simply non-negative 
vectors $\mu_i \geq 0$ ($i = 1, \ldots, n$), and (1) appears to be a system of equations of the 
following type: 

$$\frac{d\mu_i}{dt} = k_i(\mu_1, \ldots, \mu_n) \times \mu_i, \qquad (2)$$

and the only difference from a general dynamic system is the special behavior of the 
right-hand side of (2) near zero values of $\mu_i$. 

The right-hand side of (1) is the product of the function $k_\mu$ and the distribution $\mu$, 
and hence $d\mu/dt$ should be zero when $\mu$ is equal to zero; therefore the support of $\mu$ is 
conserved in time (over finite times). 
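
Conservation of the support in the finite system (2) can be illustrated with a minimal numerical sketch. The reproduction coefficients used here, $k_i(\mu) = b_i - \sum_j \mu_j$ with hypothetical rates `b`, are an assumption chosen only for illustration and are not taken from the text; the step $\mu_i \leftarrow \mu_i \exp(\Delta t\, k_i(\mu))$ is one simple discretization that keeps zero components exactly zero.

```python
import math

# Hypothetical birth rates b_i (an illustrative assumption, not from the paper).
B = [1.0, 2.0, 3.0, 2.5]

def k(mu):
    """Reproduction coefficients k_i(mu) = b_i - (total amount), one per inherited unit."""
    total = sum(mu)
    return [b - total for b in B]

def simulate(mu0, dt=0.01, steps=3000):
    """Integrate d(mu_i)/dt = k_i(mu) * mu_i with the multiplicative step
    mu_i <- mu_i * exp(dt * k_i(mu)); a zero component stays exactly zero."""
    mu = list(mu0)
    for _ in range(steps):
        rates = k(mu)
        mu = [m * math.exp(dt * r) for m, r in zip(mu, rates)]
    return mu

# The unit with the largest b_i (index 2) is absent from the initial support.
mu0 = [0.5, 0.5, 0.0, 0.5]
mu_final = simulate(mu0)
print(mu_final)
```

In this sketch the absent unit never appears, although it has the largest coefficient, and among the units present initially the one with the largest `b` dominates at large times.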

Let us start a more formal consideration, with basic definitions and notations. First, 
we introduce the space of inherited units X. In this paper X is a compact metric space 
with a metric $\rho(x,y)$. In some special sections we assume that X is a closed bounded 
domain in finite-dimensional real space $R^n$. As a particular case of compact space, a finite 
set X can be discussed. 

We study the dynamics of distributions on X. Each distribution on a compact space 
X is a continuous linear functional on the space of continuous real functions C(X). We 
follow the Bourbaki approach [10]: a measure is a continuous functional, an integral. 
Bourbaki's book [10] contains all the necessary notions and theorems (and much more 
material than we need here). Space C(X) is a Banach space endowed with the maximum 
norm 

$$\|f\| = \max_{x \in X} |f(x)|. \qquad (3)$$

If $\mu \in C^*(X)$ is a continuous functional and $f \in C(X)$, then $[\mu, f]$ is the value of $\mu$ at the 
function $f$. 

Let us mention here two other notations. If X is a bounded closed subset of a 
finite-dimensional space $R^n$, then we represent this functional as the integral 

$$[\mu, f] = \int \mu(x) f(x)\,dx, \qquad (4)$$

which is the standard notation for distribution (or generalized function) theory. Note 
that here the "density" $\mu(x)$ is not assumed to be an absolutely continuous function with 
respect to the Lebesgue measure dx (or even a function), and the notation in Eq. (4) has 
the same sense as $[\mu, f]$. If the measure is defined as a function on a $\sigma$-algebra of sets, 
then the following notation is used: 

$$[\mu, f] = \int f(x)\,\mu(dx). \qquad (5)$$

We use the notation $[\mu, f]$ for general spaces X and the representation (4) on domains 
in $R^n$ without any additional comments. The product $k \times \mu$ is defined for any $k \in C(X)$, 
$\mu \in C^*(X)$ by the equality $[k\mu, f] = [\mu, kf]$. 

The support of $\mu$, $\mathrm{supp}\,\mu$, is the smallest closed subset of X with the following property: 
if $f(x) = 0$ on $\mathrm{supp}\,\mu$, then $[\mu, f] = 0$, i.e. $\mu(x) = 0$ outside $\mathrm{supp}\,\mu$. 

In the space of measures we use weak* convergence, i.e. the convergence of averages: 

$$\mu_i \to \mu^* \ \text{ if and only if } \ [\mu_i, \varphi] \to [\mu^*, \varphi] \qquad (6)$$

for all continuous functions $\varphi \in C(X)$. This weak* convergence of measures generates 
weak* topology on the space of measures (sometimes called weak topology of conjugated 
space, or wide topology). 

Strong topology on the space of measures $C^*(X)$ is defined by the norm 
$\|\mu\| = \sup_{\|f\|=1} [\mu, f]$. 

Strictly speaking, the space on which $\mu$ is defined and the distribution class it belongs 
to should be specified. The properties of the mapping $\mu \mapsto k_\mu$ should also be specified, 
and the existence and uniqueness of solutions of (1) under given initial conditions should 
be established. In specific situations the answers to these questions are not difficult. 

The sequence of continuous functions $k_i(x)$ is considered to be convergent if it converges 
uniformly. The sequence of measures $\mu_i$ is called convergent if for any continuous function 
$\varphi$ the integrals $[\mu_i, \varphi]$ converge [weak* convergence (6)]. The mapping $\mu \mapsto k_\mu$ assigning 
the reproduction coefficient $k_\mu$ to the measure $\mu$ is assumed to be continuous with respect 
to these notions of convergence. 

Finally, the space of measures is assumed to have a bounded set M that is positively 
invariant relative to system (1): if $\mu(0) \in M$, then $\mu(t) \in M$ (we also assume that M 
is non-trivial, i.e. it is neither empty nor a one-point set). This M serves as the phase 
space of system (1). (Let us recall that the set of measures M is bounded if the set 
of integrals $\{[\mu, f] \mid \mu \in M, \|f\| \leq 1\}$ is bounded, where $\|f\|$ is the norm (3).) We study 
the dynamics of system (1) in a bounded positively invariant set M. 

Most of the results for systems with inheritance use a theorem on weak* compactness: 
The bounded set of measures is precompact with respect to weak* convergence 
(i.e. its closure is compact). Therefore the set of corresponding reproduction coefficients 
$k_M = \{k_\mu \mid \mu \in M\}$ is precompact. 
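
The difference between weak* and strong convergence of measures can be seen in a small numerical sketch. It assumes atomic measures stored as dictionaries `{point: weight}` (a representation chosen here for illustration only) and a single test function, `math.cos`, standing in for "all continuous functions". The point measures at $1/n$ converge to the point measure at 0 in the weak* sense, while their distance in the strong norm stays equal to 2.

```python
import math

def pair(measure, f):
    """Value [mu, f] of an atomic measure {point: weight} at a function f."""
    return sum(w * f(x) for x, w in measure.items())

def strong_norm(measure):
    """Strong norm of an atomic measure with finitely many atoms: sum of |weights|."""
    return sum(abs(w) for w in measure.values())

def difference(a, b):
    """Atomic measure a - b."""
    out = dict(a)
    for x, w in b.items():
        out[x] = out.get(x, 0.0) - w
    return out

delta_0 = {0.0: 1.0}          # point measure at 0
test_function = math.cos      # one continuous test function (illustrative choice)

weak_gaps, strong_gaps = [], []
for n in (1, 10, 100, 1000):
    delta_n = {1.0 / n: 1.0}  # point measure at 1/n
    weak_gaps.append(abs(pair(delta_n, test_function) - pair(delta_0, test_function)))
    strong_gaps.append(strong_norm(difference(delta_n, delta_0)))

print(weak_gaps)    # shrinks as n grows
print(strong_gaps)  # does not shrink
```

This is why weak* compactness is the useful notion here: narrow peaks approach point measures only in the weak* sense.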

Let us start with the simplest example and the first theorem, and then discuss possible 
interpretations. The simplest example of a discrete distribution emerging from a 
continuous initial distribution is given by the following equation: 

$$\frac{\partial \mu(x,t)}{\partial t} = \left[ f_0(x) - \int_a^b f_1(x)\mu(x,t)\,dx \right] \mu(x,t), \qquad (7)$$

where the functions $f_0(x)$ and $f_1(x)$ are positive and continuous on the closed segment 
[a, b]. Let the function $f_0(x)$ reach the global maximum on the segment [a, b] at a single 
point $x_0$. If $x_0 \in \mathrm{supp}\,\mu(x,0)$, then: 

$$\mu(x,t) \to \frac{f_0(x_0)}{f_1(x_0)}\,\delta(x - x_0), \quad \text{when } t \to \infty, \qquad (8)$$

where $\delta(x - x_0)$ is the $\delta$-function. 
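
The concentration described by (7) and (8) can be observed in a rough grid simulation. Everything specific in this sketch is an assumption made for illustration: the segment [0, 1], the functions $f_0(x) = 2 - 4(x - 0.5)^2$ and $f_1(x) = 1 + x$, the uniform initial density, the rectangle-rule integral and the multiplicative time step. With these choices $x_0 = 0.5$ and $f_0(x_0)/f_1(x_0) = 4/3$.

```python
import math

def simulate_selection(n=101, t_end=50.0, dt=0.02):
    """Grid approximation of equation (7) on [0, 1] with illustrative f0 and f1.
    Returns (total mass, mean position, standard deviation) of the final density."""
    dx = 1.0 / (n - 1)
    xs = [i * dx for i in range(n)]
    f0 = [2.0 - 4.0 * (x - 0.5) ** 2 for x in xs]   # unique maximum at x0 = 0.5
    f1 = [1.0 + x for x in xs]                      # positive weight in the integral
    mu = [1.0] * n                                  # smooth initial density
    for _ in range(int(round(t_end / dt))):
        # Common "death" factor: rectangle-rule value of the integral of f1 * mu.
        pressure = sum(g * m for g, m in zip(f1, mu)) * dx
        # Multiplicative step keeps the density non-negative.
        mu = [m * math.exp(dt * (r - pressure)) for m, r in zip(mu, f0)]
    mass = sum(mu) * dx
    mean = sum(x * m for x, m in zip(xs, mu)) * dx / mass
    var = sum((x - mean) ** 2 * m for x, m in zip(xs, mu)) * dx / mass
    return mass, mean, math.sqrt(var)

mass, mean, spread = simulate_selection()
print(mass, mean, spread)
```

In this sketch the density becomes a narrowing peak around $x_0$, and the total mass approaches the coefficient $f_0(x_0)/f_1(x_0)$ of the limit in (8). The narrowing is slow, which is consistent with the limit being reached only as $t \to \infty$.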

If $f_0(x)$ has several global maxima, then the right-hand side of (8) can be the sum of a 
finite number of $\delta$-functions. Here a natural question arises: is it worth considering such 
a possibility? Should we not deem it improbable for $f_0(x)$ to have more than one global 
maximum? Indeed, such a case seems to be very unlikely to occur. More details on this 
are given below. 

The limit behavior of a typical system with inheritance (1) can be much more complicated 
than (8). Here we can mention that any finite-dimensional system with a compact 
phase space can be embedded in a system with inheritance (2). An additional possibility 
for the limit behavior is, for example, the drift effect (Section 3.1). 

The first step in the routine investigation of a dynamical system is a question about 
fixed points and their stability. The first observation concerning the system (1) is that 
asymptotic stability is possible only for steady-state distributions whose support is 
discrete (i.e. for sums of $\delta$-functions). This can be proved for all consistent formalizations 
and can be understood as follows. 

Let U be a domain in X, and let the "total amount" in U (the integral of $|\mu|$ over U) be less 
than $\varepsilon > 0$ but not equal to zero. Let us replace the distribution $\mu$ by zero on U (the rest 
remains as it is). It is natural to consider this disturbance of $\mu$ as $\varepsilon$-small. However, if 
the dynamics is described by (1), there is no way back to the undisturbed distribution, 
because the support cannot increase. If the steady-state distribution $\mu^*$ is asymptotically 
stable, then for some $\varepsilon > 0$ any $\varepsilon$-small perturbation of $\mu^*$ relaxes back to $\mu^*$. This is 
possible only if for any domain U the integral of $|\mu^*|$ over U is either 0 or greater than $\varepsilon$. 
Hence, this asymptotically stable distribution $\mu^*$ is the sum of a finite number of point 
measures: 

$$\mu^* = \sum_{i=1}^{q} N_i \delta_{x_i}, \qquad (9)$$

with $|N_i| > \varepsilon$ for all i, where $\delta_{x_i}$ is the normalized point measure at the point $x_i$. In 
distribution theory notation, this corresponds to the $\delta$-function $\delta(x - x_i)$. 

The perturbation discussed is small not only in the weak* topology, but also in the 
strong sense, and thus it is sufficient to consider strongly small perturbations to prove 
that the asymptotically stable distribution should be discrete. Hence, this statement is 
true if the operator $\mu \mapsto k_\mu$ is continuous for strong topology on the space of measures. 
This is a significantly weaker requirement than being continuous in weak* topology. 

Thus, we have the first theorem. 

Theorem 1 The support of asymptotically stable distributions for the system (1) is always 
discrete. 

This simple observation has many strong generalizations to general $\omega$-limit points, to 
equations for vector measures, etc. 

Dynamic systems in which the phase variable is a distribution $\mu$ and the distribution 
support is an integral of motion frequently occur in both physics and biology. Because of 
their attractive properties, they are frequently used as approximations: we try to find the 
"main part" of the system in the form of (1), and represent the rest as a small perturbation 
of the main part. 

Equations in the form of (1) allow the following biological interpretation: $\mu$ is the 
distribution of the number (or of biomass or another extensive variable) over inherited 
units: species, varieties, supergenes, genes. Whatever is considered as the inherited unit 
depends on the context and the specific problem. The value of $k_\mu(x)$ is the reproduction 
coefficient of the inherited unit x under given conditions. The notion of "given conditions" 
includes the distribution $\mu$, and the reproduction coefficient depends on $\mu$. Equation (7) 
can be interpreted as follows: $f_0(x)$ is the specific birth-rate of the inherited unit 
x (below, for the sake of definiteness, x is a variety, following the spirit of Darwin's 
seminal work [15]); the death rate for the representatives of all inherited units (varieties) 
is determined by one common factor depending on the density, $\int_a^b f_1(x)\mu(x,t)\,dx$; $f_1(x)$ is 
the individual contribution of variety x to this death rate. 

On the other hand, for systems of waves with a parametric interaction, $k_\mu(x)$ can be 
the amplification (decay) rate of the wave with wave vector x. 

Conservation of the support in (1) can be considered as inheritance, and we call system 
(1) and its nearest generalizations "systems with inheritance". Traditional separation of 
the process of transferring biological information into inheritance and mutations, which 
are small in any admissible sense, can be compared to a description according to the 
following pattern: system (1) (or its nearest generalizations) plus small disturbances. 
Beyond the limits of such a description, discussion of inheritance loses the conventional 
sense. 

In biology such an approximation is essentially applicable to all classical genetics, and 
to the formal content of the theory of natural selection. The initial diversity is "thinned 
out" over time, and the limit distribution supports are described by some extremal prin- 
ciples (principles of optimality). 

The first study of dynamical systems with inheritance was carried out by J.B.S. Haldane. 
He used the simplest examples, studied steady-state distributions and obtained 
the extremal principle for them. His pioneering book "The Causes of Evolution" (1932) 
[38] gives a clear explanation of the connections between inheritance (the conservation of 
distribution support) and the optimality of selected varieties. 

Haldane's work was followed by an entirely independent series of works on the 
S-approximation in the spin wave theory and on wave turbulence [85], [86], [53], which studied 
wave configurations in the approximation of an "inherited" wave vector, and by 
"synergetics" [39], in which the "natural selection" of modes is one of the basic concepts. 

At the same time, a series of works on biological kinetics was completed (see, for 
example, [69], [27], [72], [28]). These studies addressed not only steady states, but also general 
limit distributions [27], [28] and waves in the space of inherited units [69]. For steady states 
a new type of stability was described: stable realizability (see below). Many examples 
of ecological applications are collected in reference [72]. The application of optimality 
principles to crop growth simulation is analyzed in reference [65]. Some attempts at 
sociological applications are also known [71]. 

The Haldane achievements were continued by works on stable evolutionary strategies 
and evolutionary games. In works by Maynard Smith [75], [76] the "War of Attrition" model 
of animal conflict was introduced and the notion of evolutionarily stable strategy (ESS) 
was defined. This notion was elaborated further in many papers [78], H3j [79] EE EEU 0, SI El 
[8] E] [7J ED E2l [21] . The reader is also referred to a recent review [11], in which evolutionary 
game dynamics is defined as the application of population dynamical methods to game 
theory. (It was invented by evolutionary biologists, but had a great impact on modern 
game theory and mathematical economics.) For some classes of models (a "generalized 
war of attrition" [2]), it was demonstrated that either (i) there is no ESS or (ii) there is 
a unique ESS, which is fully specified. In the case for which only a finite number of pure 
strategies is available, global convergence to the ESS is shown. Of course, for systems with 
inheritance (1), more complicated behavior is also possible. In reference [ST] collections 
of subsets that might be supports for ESSs were identified. Imaginary experiments with 
mutant invasion are based on the notion of ESS [75, 76]. The dynamical foundation of 
this notion and a dynamical theory of uninvadability in the context of stability theory 
have been developed [1]. It should be mentioned that the analogue of uninvadability, 
external stability, was one of the main notions of the S-theory in physics [85], [86], [53], and 
ecological applications of this notion were developed reasonably far in the 1970s-1980s; 
see references [2S1 Ell E2] and references therein. 

The dynamics of evolutionary games for the case of a continuum of possible strategies 
has been investigated [8]. The stability properties of stationary points were studied and 
some examples were given. In fact, in reference [8] a particular case of systems with 
inheritance was studied; in this case X is the space of strategies of an evolutionary game. 

The setwise notions of stable evolutionary sets were introduced [79] for evolutionary 
game models in which there is a continuum of equilibrium states, with no state stable in 
itself, but which together are evolutionarily stable. This concept was also analyzed from 
a dynamical point of view [6]. 

Recently, the theory of evolutionary games with a continuum of possible strategies has 
been developed very intensively [7J EE E2] EE]; see also the review in reference [44]. 

The first biological applications of systems with inheritance were population dynamics 
and mathematical genetics. All the classical equations for distributions of animals or 
genes have the form ([1]) (or a generalization with additional variables). The space of 
inherited units, X, in these cases has a direct biological interpretation: it is the space 
of inherited variations, or species, or the space of alleles ("variations of genes"). In 
ecological applications it has been proved [72] that the concept of inherited variations of 
animals (without consideration of alleles) gives appropriate accuracy in the problem of 
succession, that is, in the modeling and simulation of changes in a biological community 
under changing conditions. But what is a gene? Biology returns to this question again 
and again [25]. The interplay between "units of function" and "units of inheritance" at 
different time scales produces a very complicated and fascinating picture. 

Epigenetic inherited units provide much interesting material for modeling. The source 
of the dynamical difference between genetic and epigenetic inheritance is the difference in 
their time scales, when these differ [66]. The interaction between these types of inheritance 
could be quite mysterious. For example, the rates and some specific properties of genetic 
mutation processes might be inherited properties, as was discovered for the effect of 
transgenerational instability [20]. This phenomenon is probably due to an epigenetic 
mechanism. 

For the "ecological time scale", the maternal effect forms another group of inherited 
units. These units are important for evolution at ecological time scales [67]. 

The space of inherited strategies provides the interpretation of X in many applications. 
In particular, the selection of strategies of the spatial distribution of individuals has been 
studied [31]. In the case of non-monotonic dependence of the reproduction coefficient 
on the mean population density, cluster formation was proven. This theory was applied 
to an investigation of the creation of cellular clusters in flow-rate cultivators [33]. 

It is clear that animal migration is not completely random and that it depends on 
conditions; in particular, predator migration depends on space variations of the prey 
density, which might imply interesting dynamical consequences, including changes in the 
number and stability of equilibria and limit cycles [33]. Models of evolutionarily optimal 
migration strategies in prey-predator systems have been studied [70]. A great variety of 
dynamic regimes has been observed, and some of them could be interpreted as outbreak 
explosions. 

The distribution of successors over time (that represent variations of individual devel- 
opment, delayed maturation and even a pause in ontogenesis) is important in the adapta- 
tion to stochastic conditions. Evolutionarily optimal strategies of facultative diapause for 
forest insects have been studied [72]. In [SD] a population with complicated dynamics was 
studied numerically. It was demonstrated that random noise shifts the system towards a 
higher probability of delaying maturation. 

In ecological physiology, the points of X represent strategies of physiological adap- 
tation. A useful notion is the adaptation resource. The presentation of the adaptation 
process as a redistribution of this resource for the neutralization of external factors is 
an effective tool for adaptation modelling [33]. These models of "factors-resource" and 
the dynamic theory of optimal evolutionary strategies allowed us to develop "correlation 
adaptometry" [TTJ [32]. This method of comparative ecological physiology is now in use 
for comparative analysis of populations and groups for a wide range of organisms, from 
the human population of the Far North [6U [77] to herbaceous species jUJ . 

The purpose of this paper is to present general results for the theory of systems with in- 
heritance: optimality principles for limit distributions, theorems on selection, estimations 
of the limit diversity (estimates of the number of points in the support for limit 
distributions), the drift effect and drift equations. Some of these results have been published in 
the preprint [57] and, partially, in book form [28]. 

The main benefit of the special form of systems with inheritance is the possibility of 
describing the limit behavior of such systems without solving the equations. A 
system of weak and strong optimality principles describes the supports of limit or stable 
distributions. A special drift asymptotic reduces the asymptotic behavior at large times 
to a finite system of ordinary differential equations (ODEs). In subsequent sections this 
technique is developed on the basis of investigation of the general system represented by 
Equation (1). 

The outline of the paper is as follows. In the next section the optimality principles for 
supports of $\omega$-limit distributions are developed. These principles have a "weak" form: the 
set of possible supports is estimated from above, and it is not obvious that this estimate 
is effective (this is proved in Section 2.4). 

Minimax estimates of the number of points in the support of $\omega$-limit distributions 
are given in Section 2.2. The idea is to study systems under an $\varepsilon$-small perturbation, 
to estimate the maximal number of points for each realization of the perturbed system, 
and then to estimate the minimum of these maxima among various realizations. These 
minimax estimates can be constructive and do not use integration of the system. The set 
of reproduction coefficients $\{k(\mu) \mid \mu \in M\}$ is compact in C(X). Therefore, this set can 
be approximated by a finite-dimensional linear space $L_\varepsilon$ with any given accuracy $\varepsilon$. 

The number of coexisting inherited units ("quasi-species") is estimated from above 
as $\dim L_\varepsilon$. This estimate is true both for stationary and non-stationary coexistence. In 
its general form this estimate was proved in 1980 [27], [28], but reasoning of this type 
has a long history. Perhaps G. Gause [26] was the first to suggest the direct connection 
between the number of species and the number of resources. One can call this number 
the "dimension of the environment." He proposed the famous competitive exclusion principle. 
This principle is often called the Gause principle, but G. Gause considered his work a 
development of Ch. Darwin's ideas of the struggle for existence. This is obvious even from 
the title of his book [26]. More details about the early history of the competitive exclusion 
principle are presented in the review paper of G. Hardin [41]. 

MacArthur and Levins [55] suggested that the number of coexisting species is limited 
by the number of ecological resources. Later [56], they studied the continuous resource 
distribution (niche space) where the number of species is limited by the fact that the niches 
must not overlap too much. In 1999, G. Meszena and J.A.J. Metz [59] developed further 
the idea of environmental feedback dimensionality (perhaps independently of [57], [28]). 
In 2003 [TTJ, the theory of structurally stable stationary coexistence was developed, and 
in 2006 the idea of robustness in competitive exclusion was approached again, as a "unified 
theory" of "competitive exclusion and limiting similarity" [58]. All these achievements 
are related to estimation of the dimension of the set $\{k(\mu) \mid \mu \in M\}$ or of some of its subsets. 
This dimension plays the role of the "robust dimension of population regulation". 

Section 2.3 contains auxiliary results from functional analysis. Two problems are 
studied: (i) how to describe the sets of global maximum points in a typical compact set of 
continuous functions on a compact metric space; and (ii) how to define the "typicalness" 
in Banach spaces in order to meet intuitive expectations. It is obvious that typicalness in 
the sense of Baire category violates some of the essential requirements of common sense; 
for example, even a real line can be divided into a set of zero measure and a set of 
first category. Hence, a stronger notion is needed. Completely thin sets are introduced 
and the typical properties of compact sets of continuous functions are studied (the sets of 
exclusions are completely thin). 

A theorem of selection efficiency is presented in Section 2.4. The sense of this theorem 
is as follows: for almost every system the support of all $\omega$-limit distributions is small 
(in an appropriate strong sense). Its geometrical interpretation suggested by M. Gromov 
is explained in Section 2.5. Specific entropy-like functions, the decreasing measures of 
diversity, are constructed in Section 2.6. The decrease of these functions describes 
self-organization. 

The drift equations (Section 3.1) describe the asymptotic behavior of systems with 
inheritance near the limit distributions (when X is a domain in $R^n$, or a manifold). 
That asymptotics proves to be the motion of narrow, almost Gaussian peaks. The drift 
equations are ODEs. In fact, the drift equations replace the initial infinite-dimensional 
dynamic system (1) in many applications: usually the system has enough time to reach 
the drift asymptotic. The bifurcations with a change in the number of peaks deserve special 
attention: this "branching-type" evolution [Tj5] can be related to speciation. 

The simplest model for "reproduction + small mutations" is developed. The limit 
of zero mutations is singular, because arbitrarily small (but non-zero) mutations added 
to equation (1) destroy the dynamical invariance of the subspaces $\{\mu \mid \mathrm{supp}\,\mu \subseteq Y\}$ for closed 
subsets $Y \subset X$. Nevertheless, if we consider initial distributions $\mu_0$ with $\mathrm{supp}\,\mu_0 = X$ (all 
variability is actually present), then sufficiently small mutations change nearly nothing: 
the limit $\delta$-shaped peaks just transform into sufficiently narrow peaks, and the zero limit of 
the velocity of their drift at $t \to \infty$ is replaced by a small finite one in the presence 
of the drift effect. Moreover, there exists a scale invariance, and the dynamics for large t does 
not depend on the nonzero mutation intensity if the latter is sufficiently small: to change this 
intensity, we just need to rescale time. 

Various types of distribution stability are studied in Section 3.3: internal stability 
(stability with respect to perturbations that do not extend the support), external stability 
or uninvadability (stability with respect to strongly small perturbations that extend the support), 
and stable realizability (stability with respect to a small shift and small extension of 
density peaks). The general condition for stable realizability is the usual ODE Lyapunov 
stability condition with respect to the corresponding drift equations. 

The cell population structured by age (and age-defined variables, size, chemical 
properties, etc.) is studied in Section 4. The most restrictive assumption in the model is that 
all cells have the same cell-cycle period, T. Hence, the cell-cycle phase is the inherited 
variable. Nevertheless, the general results from previous sections cannot be applied to 
this model because of additional symmetry. Direct analysis of the example shows that in 
this case selection is also efficient and the equivariant selection theory is possible. This 
selection is an example of self-synchronization. Small deviations from the basic assumption 
lead to smooth self-synchronization waves, and large deviations can destroy the effect. 
In Section 5 a brief description of the main results is presented. 

2 Selection Theorem 

2.1 Optimality principle for limit diversity 

Description of the limit behavior of a dynamical system can be much more complicated 
than enumerating stable fixed points and limit cycles. The leading candidate for an adequate 
formalization of the limit behavior is the concept of the "$\omega$-limit set". It was discussed in detail 
in the classical monograph [lj. The fundamental textbook on dynamical systems [12] and 
the introductory review [H] are also available. 

Let $f(t)$ be the dependence of the position of a point in the phase space on time $t$ (i.e.
the motion of the dynamical system). A point $y$ is an $\omega$-limit point of the motion $f(t)$ if
there exists a sequence of times $t_i \to \infty$ such that $f(t_i) \to y$.

The set of all $\omega$-limit points for the given motion $f(t)$ is called the $\omega$-limit set. If,
for example, $f(t)$ tends to the equilibrium point $y^*$, then the corresponding $\omega$-limit set
consists of this equilibrium point. If $f(t)$ winds onto a closed trajectory (the limit
cycle), then the corresponding $\omega$-limit set consists of the points of the cycle, and so on.

General $\omega$-limit sets are not often encountered in specific situations. This is because of
the lack of efficient methods to find them in a general situation. Systems with inheritance
are a case where there are efficient methods to estimate the limit sets from above. This is
done by the optimality principle.

Let $\mu(t)$ be a solution of (1). Note that

$$\mu(t) = \mu(0) \exp \int_0^t k_{\mu(\tau)} \, d\tau. \tag{10}$$

Here and below we do not display the dependence of the distributions $\mu$ and of the reproduction
coefficients $k$ on $x$ when it is not necessary. Fix the notation for the average value of
$k_{\mu(\tau)}$ on the segment $[0, t]$:

$$\langle k_{\mu(t)} \rangle_t = \frac{1}{t} \int_0^t k_{\mu(\tau)} \, d\tau. \tag{11}$$

Then the expression (10) can be rewritten as

$$\mu(t) = \mu(0) \exp\left( t \, \langle k_{\mu(t)} \rangle_t \right). \tag{12}$$
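The identity (12) can be checked numerically. The following is a minimal sketch, not taken from the source: it assumes a finite set of inherited units and a hypothetical reproduction coefficient $k_i(\mu) = r_i - \sum_j \mu_j$; the names `integrate` and `identity_gap` are illustrative only.

```python
import math

def integrate(r, mu0, t_end, dt=1e-3):
    """Integrate d(mu_i)/dt = k_i(mu) * mu_i with the assumed k_i(mu) = r_i - sum(mu).

    Returns the final state, the time average of each k_i, and the final time.
    """
    mu = list(mu0)
    k_integral = [0.0] * len(r)          # running value of the integral in (10)
    steps = round(t_end / dt)
    for _ in range(steps):
        total = sum(mu)
        k = [ri - total for ri in r]     # reproduction coefficients at the current state
        # Exponential Euler step: k is frozen on [t, t + dt].
        mu = [m * math.exp(ki * dt) for m, ki in zip(mu, k)]
        k_integral = [a + ki * dt for a, ki in zip(k_integral, k)]
    t = steps * dt
    k_avg = [a / t for a in k_integral]  # discrete analogue of (11)
    return mu, k_avg, t

def identity_gap(mu, mu0, k_avg, t):
    """Largest relative difference between mu(t) and mu(0) * exp(t * <k>_t), cf. (12)."""
    return max(abs(m - m0 * math.exp(t * ka)) / m
               for m, m0, ka in zip(mu, mu0, k_avg))

r = [1.0, 0.8, 0.5]
mu0 = [0.1, 0.1, 0.1]
mu, k_avg, t = integrate(r, mu0, t_end=20.0)
print(identity_gap(mu, mu0, k_avg, t))
```

With this discretization the two sides of (12) agree up to floating-point rounding, because the product of the step factors equals the exponential of the accumulated integral.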

If $\mu^*$ is an $\omega$-limit point of the solution $\mu(t)$, then there exists a sequence of times
$t_i \to \infty$ such that $\mu(t_i) \to \mu^*$. Let it be possible to choose a convergent subsequence of the
sequence of the average reproduction coefficients $\langle k_{\mu(t)} \rangle_t$ that corresponds to the times $t_i$.
We denote by $k^*$ the limit of this subsequence. Then the following statement is valid: on
the support of $\mu^*$ the function $k^*$ vanishes, and on the support of $\mu(0)$ it is non-positive:

$$k^*(x) = 0 \ \text{ if } x \in \mathrm{supp}\,\mu^*, \qquad k^*(x) \le 0 \ \text{ if } x \in \mathrm{supp}\,\mu(0). \tag{13}$$

Taking into account the fact that $\mathrm{supp}\,\mu^* \subseteq \mathrm{supp}\,\mu(0)$, we come to the formulation of
the optimality principle (13): The support of the limit distribution consists of points of the
global maximum of the average reproduction coefficient on the initial distribution support.
The corresponding maximum value is zero.

We should also note that not necessarily all points of maximum of $k^*$ on $\mathrm{supp}\,\mu(0)$
belong to $\mathrm{supp}\,\mu^*$, but all points of $\mathrm{supp}\,\mu^*$ are points of maximum of $k^*$ on $\mathrm{supp}\,\mu(0)$.

If $\mu(t)$ tends to the fixed point $\mu^*$, then $\langle k_{\mu(t)} \rangle_t \to k_{\mu^*}$ as $t \to \infty$, and $\mathrm{supp}\,\mu^*$ consists
of the points of the global maximum of the corresponding reproduction coefficient $k_{\mu^*}$ on
the support of $\mu^*$. The corresponding maximum value is zero.

If $\mu(t)$ tends to the limit cycle $\mu^*(t)$ ($\mu^*(t + T) = \mu^*(t)$), then all the distributions $\mu^*(t)$
have the same support. The points of this support are the points of maximum (global,
zero) of the reproduction coefficient averaged over the cycle,

$$k^* = \langle k_{\mu^*(t)} \rangle_T = \frac{1}{T} \int_0^T k_{\mu^*(\tau)} \, d\tau, \tag{14}$$

on the support of $\mu(0)$.

The supports of the $\omega$-limit distributions are specified by the functions $k^*$. It is obvious
where to get these functions from in the cases of fixed points and limit cycles. There are
at least two questions: what ensures the existence of average reproduction coefficients at
$t \to \infty$, and how to use the described extremal principle (and how efficient it is). The
latter question is the subject to be considered in the following sections. In the situation
to follow the answers to these questions have the validity of theorems.

Due to the theorem about weak* compactness, the set of reproduction coefficients
$k_M = \{k_\mu \mid \mu \in M\}$ is precompact; hence, the set of averages (11) is precompact too,
because it is a subset of the closed convex hull $\mathrm{conv}(k_M)$ of the compact set. This
compactness allows us to claim the existence of the average reproduction coefficient $k^*$ for
the description of the $\omega$-limit distribution $\mu^*$ with the optimality principle (13).

2.2 How many points does the limit distribution support hold?

The limit distribution is concentrated in the points of (zero) global maximum of the
average reproduction coefficient. The average is taken along the solution, but the solution
is not known beforehand. With the convergence towards a fixed point or to a limit cycle
this difficulty can be circumvented. In the general case the extremal principle can be
used without knowing the solution, in the following way [28]. Consider the set of all
dependencies $\mu(t)$, where $\mu$ belongs to the phase space, the bounded set $M$. The set of
all averages over $t$ is $\{\langle k_{\mu(t)} \rangle_t\}$. Further, take all limits of sequences formed by these
averages, so that the set of averages is closed. The result is the closed convex hull $\mathrm{conv}(k_M)$ of
the compact set $k_M$. This set involves all possible averages (11) and all their limits. In
order to construct it, the true solution $\mu(t)$ is not needed.

The weak optimality principle is expressed as follows. Let $\mu(t)$ be a solution of (1) in
$M$, and let $\mu^*$ be any of its $\omega$-limit distributions. Then in the set $\mathrm{conv}(k_M)$ there is a function
$k^*$ such that its maximum value on the support $\mathrm{supp}\,\mu_0$ of the initial distribution $\mu_0$ equals
zero, and $\mathrm{supp}\,\mu^*$ consists only of points of the global maximum of $k^*$ on $\mathrm{supp}\,\mu_0$ (13).
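The optimality principle can be illustrated on a toy model. The sketch below is not from the source: it assumes nine inherited units $x_j$ on a grid, a hypothetical fitness landscape $r(x) = 1 - (x - 0.8)^2$, and the reproduction coefficient $k_\mu(x) = r(x) - \sum_y \mu(y)$. It compares a run where all units are initially present with a run where the best unit is missing from the initial support.

```python
import math

def run(r, mu0, t_end=600.0, dt=0.01):
    """Integrate d(mu_j)/dt = (r_j - sum(mu)) * mu_j; return final mu and time-averaged k."""
    mu = list(mu0)
    k_sum = [0.0] * len(r)
    steps = round(t_end / dt)
    for _ in range(steps):
        total = sum(mu)
        k = [rj - total for rj in r]
        # Multiplicative update: a unit absent at t = 0 stays absent (inheritance).
        mu = [m * math.exp(kj * dt) for m, kj in zip(mu, k)]
        k_sum = [a + kj for a, kj in zip(k_sum, k)]
    k_avg = [a / steps for a in k_sum]
    return mu, k_avg

def support(mu, threshold=1e-6):
    """Indices of units whose density is still above a small (assumed) threshold."""
    return [j for j, m in enumerate(mu) if m > threshold]

xs = [j / 4 for j in range(9)]              # hypothetical inherited units on [0, 2]
r = [1.0 - (x - 0.8) ** 2 for x in xs]      # assumed fitness landscape

# Case A: every unit is initially present.
mu_a, k_a = run(r, [0.01] * len(xs))

# Case B: the best unit (x = 0.75, index 3) is absent from the initial support.
mu0_b = [0.01] * len(xs)
mu0_b[3] = 0.0
mu_b, k_b = run(r, mu0_b)

print(support(mu_a), support(mu_b))
```

In case B the averaged coefficient is larger at the missing unit than anywhere on the initial support, which is consistent with (13): the inequality $k^* \le 0$ is claimed only on $\mathrm{supp}\,\mu(0)$. At finite time the maximal average is slightly positive and decays like $1/t$.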

Of course, in the set $\mathrm{conv}(k_M)$ there are usually many functions that are irrelevant to
the time average reproduction coefficients for the given motion $\mu(t)$. Therefore, the weak
extremal principle is really weak: it gives too many possible supports of $\mu^*$. However,
even such a principle can help to obtain useful estimates of the number of points in the
supports of $\omega$-limit distributions.

It is not difficult to suggest systems of the form (1) in which any set can be the
limit distribution support. The simplest example: $k_\mu \equiv 0$. Here any distribution is an
$\omega$-limit (fixed) one. However, almost any arbitrarily small perturbation of the system destroys
this pathological property.

In realistic systems, especially in biology, the coefficients fluctuate and are never
known exactly. Moreover, the models are known in advance to have a finite error which
cannot be eliminated by the choice of the parameter values. This gives rise to the idea
to consider not individual systems (1), but ensembles of similar systems [28].

Let us estimate the maximum number of points in the support for each individual system from
the ensemble (in its $\omega$-limit distributions), and then estimate the minimum of these maxima
over the whole ensemble (the minimax estimation). The latter is motivated by the fact that if
an inherited unit has gone extinct under some conditions, it will not appear even under a
change of conditions.

Let us consider an ensemble that is simply the $\varepsilon$-neighborhood of the given system
(1). The minimax estimates of the number of points in the support of the $\omega$-limit distribution
are constructed by approximating the dependencies $k_\mu$ by finite sums

$$k_\mu = \varphi_0(x) + \sum_{i=1}^n \varphi_i(x) \psi_i(\mu). \tag{15}$$

Here $\varphi_i$ depend on $x$ only, and $\psi_i$ depend on $\mu$ only. Let $\varepsilon_n > 0$ be the distance from
$k_\mu$ to the nearest sum (15) (the "distance" is understood in a suitable rigorous sense,
which depends on the specific problem). So, we have reduced the problem to the estimation of
the diameters $\varepsilon_n > 0$ of the set $\mathrm{conv}(k_M)$.

The minimax estimation of the number of points in the limit distribution
support gives the answer to the question, "How many points does the limit distribution
support hold": If $\varepsilon > \varepsilon_n$ then, in the $\varepsilon$-vicinity of $k_\mu$, the minimum of the maxima of the
number of points in the $\omega$-limit distribution support does not exceed $n$.

In order to understand this estimate it is sufficient to consider system (1) with $k_\mu$ of
the form (15). In this case for any dependence $\mu(t)$ the averages (11) have the form

$$\frac{1}{t} \int_0^t k_{\mu(\tau)} \, d\tau = \varphi_0(x) + \sum_{i=1}^n \varphi_i(x) a_i, \tag{16}$$

where $a_i$ are some numbers. The ensemble of the functions (16) for various $a_i$ forms an
$n$-dimensional linear manifold. How many points of the global maximum (equal to zero)
could a function of this family have?

Generally speaking, it can have any number of maxima. However, it seems obvious
that "usually" one function has only one point of global maximum, while it is "improbable"
that the maximum value is zero. At least, with an arbitrarily small perturbation of
the given function, we can make the point of the global maximum unique and
the maximum value non-zero.

In a one-parametric family of functions a zero value of the global maximum may occur
that cannot be eliminated by a small perturbation, and individual functions of
the family may have two global maxima.

In the general case we can state that "usually" each function of the $n$-parametric
family (16) can have not more than $n$ points of the zero global maximum (of course,
there may be fewer, and for the majority of functions of the family the global maximum is,
as a rule, not equal to zero at all). What "usually" means here requires a special
explanation, given in the next section.
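The counting behind this claim can be sketched with exact linear algebra. The example below is an illustration under stated assumptions, not the argument of the source: it takes the hypothetical family $k_a(x) = -x^n + \sum_{i=1}^n a_i x^{i-1}$ (that is, $\varphi_0(x) = -x^n$, $\varphi_i(x) = x^{i-1}$) and checks only the necessary condition $k_a(x_j) = 0$ at $q$ prescribed points, which gives $q$ linear equations for the $n$ unknowns $a_i$.

```python
from fractions import Fraction

def is_solvable(points, n):
    """Can a_1..a_n be chosen so that -x**n + sum_i a_i * x**(i-1) vanishes at all points?"""
    # Augmented matrix of the system sum_i a_i * x**(i-1) = x**n, one row per point.
    rows = [[Fraction(x) ** j for j in range(n)] + [Fraction(x) ** n] for x in points]
    rank = 0
    for col in range(n):
        pivot = next((i for i in range(rank, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        for i in range(len(rows)):
            if i != rank and rows[i][col] != 0:
                factor = rows[i][col] / rows[rank][col]
                rows[i] = [a - factor * b for a, b in zip(rows[i], rows[rank])]
        rank += 1
    # The system is inconsistent iff some reduced row reads 0 = nonzero.
    return all(any(v != 0 for v in row[:-1]) or row[-1] == 0 for row in rows)

print(is_solvable([0, 1, 2], 3))      # q = n points
print(is_solvable([0, 1, 2, 3], 3))   # q = n + 1 points
```

For this particular family, $n$ zeros can always be prescribed, while $n + 1$ cannot, since a monic polynomial of degree $n$ has at most $n$ roots. Other choices of $\varphi_i$ would need their own analysis.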

In applications $k_\mu$ is often represented by an integral operator, linear or nonlinear. In
this case the form (15) corresponds to kernels of integral operators represented as
sums of products of functions. For example, a reproduction coefficient of the
form

$$k_\mu(x) = \varphi_0(x) + \sum_{i=1}^n \varphi_i(x) \int g_i(y) \mu(y) \, dy \tag{17}$$

also has the form (15), with $\psi_i(\mu) = \int g_i(y) \mu(y) \, dy$.

The linear reproduction coefficients occur in applications rather frequently. For them
the problem of the minimax estimation of the number of points in the $\omega$-limit distribution
support is reduced to the question of the accuracy of approximation of the linear integral
operator by sums of kernel products (17).

2.3 Almost finite sets and "almost always"

In this section, some auxiliary propositions and definitions are presented. The supports
of the $\omega$-limit distributions for the systems with inheritance were characterized by the
optimality principle. These supports consist of points of global maximum of the average
reproduction coefficient. We can a priori (without studying the solutions in detail)
characterize the compact set that includes all possible average reproduction coefficients.
Hence, we get a problem: how to describe the sets of global maximum for all functions from
a generic compact set of functions. First of all, any closed subset $M \subset X$ is the set of global
maximum of a continuous function, for example, of the function $f(x) = -\rho(x, M)$, where
$\rho(x, M)$ is the distance between a set and a point: $\rho(x, M) = \inf_{y \in M} \rho(x, y)$, and $\rho(x, y)$
is the distance between points. Nevertheless, we can expect that one generic function has
one point of global maximum, that in a generic one-parametric family there might exist functions
with two points of global maximum, etc. How do these expectations meet the exact results?
What does the notion "generic" mean? What can we say about sets of global maximum
of functions from a generic compact family? In this section we answer these questions.

"Almost always", "typically", "generically" a function has only one point of global 
maximum. This sentence should be given a rigorous meaning. Formally it is not difficult,
but haste is dangerous when defining "genericity".

Here are some examples of correct but useless statements about "generic" properties of
functions: Almost every continuous function is not differentiable; Almost every $C^1$-function
is not convex. Their meaning for applications is most probably this: the genericity used
above for continuous functions or for $C^1$-functions is irrelevant to the subject.

Most frequently the motivation for definitions of genericity is found in such a situation:
given $n$ equations with $m$ unknowns, what can we say about the solutions? The answer
is: in a typical situation, if there are more equations than unknowns ($n > m$), there
are no solutions at all, but if $n \le m$ ($n$ is less than or equal to $m$), then either there is an
$(m - n)$-parametric family of solutions, or there are no solutions.

The best known example of using this reasoning is the Gibbs phase rule in classical
chemical thermodynamics. It limits the number of co-existing phases. There exists a
well-known example of such reasoning in mathematical biophysics too. Let us consider a
medium where $n$ species coexist. The medium is assumed to be described by $m$ parameters
$s_j$. Dynamics of these parameters depends on the organisms. In the simplest case, the
medium is a well-mixed solution of $m$ substances. Let the organisms interact through the
medium, changing its parameters, the concentrations of $m$ substances. It can be formalized
by a system of equations:

$$\begin{aligned}
\frac{d\rho_i}{dt} &= k_i(s_1, \ldots, s_m) \times \rho_i \quad (i = 1, \ldots, n); \\
\frac{ds_j}{dt} &= q_j(s_1, \ldots, s_m, \rho_1, \ldots, \rho_n) \quad (j = 1, \ldots, m).
\end{aligned} \tag{18}$$

In a steady state, for each of the coexisting species we have an equation with respect
to the state of the medium: the corresponding reproduction coefficient $k_i$ is zero. So,
the number of such species cannot exceed the number of parameters of the medium. In
a typical situation, in the $m$-parametric medium in a steady state there can exist not
more than $m$ species. This is the concurrent (competitive) exclusion principle in the form of
G. Gause [26]. Here, the main hypothesis about the interaction of organisms with the medium is
that the number of essential components of the medium is bounded from above by $m$, and an
increase in the number of species does not extend the list of components further. Dynamics of
the parameters depends on the organisms, but their nomenclature is fixed.

This concurrent exclusion principle allows numerous generalizations [55], [56], [52], [17], [58].
The theorem of the natural selection efficiency may also be considered as its generalization.
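The steady-state exclusion principle can be illustrated by a small simulation of equations of type (18). The sketch below is not from the source: it assumes $n = 3$ species on $m = 1$ resource with hypothetical coefficients $k_i(s) = c_i s - d$ and $q(s, \rho) = d\,(s_{\mathrm{in}} - s) - s \sum_i c_i \rho_i$ (a chemostat-like model chosen only for illustration).

```python
import math

def chemostat(c, d=0.5, s_in=2.0, t_end=300.0, dt=1e-2):
    """n species on one resource (assumed model with m = 1):
    d(rho_i)/dt = (c_i * s - d) * rho_i,
    ds/dt       = d * (s_in - s) - s * sum_i c_i * rho_i.
    """
    rho = [0.1] * len(c)
    s = s_in
    for _ in range(round(t_end / dt)):
        k = [ci * s - d for ci in c]                      # reproduction coefficients k_i(s)
        uptake = s * sum(ci * ri for ci, ri in zip(c, rho))
        s += dt * (d * (s_in - s) - uptake)               # explicit Euler step for the medium
        rho = [ri * math.exp(ki * dt) for ri, ki in zip(rho, k)]
    return s, rho

def survivors(rho, threshold=1e-6):
    """Indices of species whose density stays above a small (assumed) threshold."""
    return [i for i, ri in enumerate(rho) if ri > threshold]

s, rho = chemostat([1.0, 0.8, 0.6])
print(s, rho, survivors(rho))
```

In this run a single species remains, the one with the lowest break-even resource level $d / c_i$, and the medium settles at the value of $s$ where that species has $k_i(s) = 0$: one medium parameter supports one species.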

An analogous assertion for non-steady-state coexistence of species in the case of
equations (18) is not true. It is not difficult to give an example of stable coexistence under
oscillating conditions of $n$ species in the $m$-parametric medium at $n > m$.

But if $k_i(s_1, \ldots, s_m)$ are linear functions of $s_1, \ldots, s_m$, then for non-stationary conditions
we have the concurrent exclusion principle too. In that case, the time average of the
reproduction coefficient is the reproduction coefficient for the average state of the medium:

$$\langle k_i(s_1(t), \ldots, s_m(t)) \rangle = k_i(\langle s_1 \rangle, \ldots, \langle s_m \rangle)$$

because of linearity. If $\langle \rho_i \rangle \neq 0$ then $k_i(\langle s_1 \rangle, \ldots, \langle s_m \rangle) = 0$, and we obtain the
non-stationary concurrent exclusion principle "in average". And again, it is valid "almost
always".
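The role of linearity can be seen in a two-line computation. The sketch below uses hypothetical sample values of a single medium parameter $s$ and two assumed reproduction coefficients, one linear and one nonlinear; none of these come from the source.

```python
def time_average(values):
    """Plain average of equally spaced samples."""
    return sum(values) / len(values)

# Hypothetical oscillating medium: one parameter s sampled over two periods.
s_samples = [0.0, 2.0, 0.0, 2.0]

def k_linear(s):
    """Assumed linear reproduction coefficient."""
    return 1.0 - s

def k_nonlinear(s):
    """Assumed nonlinear reproduction coefficient."""
    return 1.0 - s * s

avg_s = time_average(s_samples)
# Difference between <k(s(t))> and k(<s>) for each coefficient.
lin_gap = time_average([k_linear(s) for s in s_samples]) - k_linear(avg_s)
nonlin_gap = time_average([k_nonlinear(s) for s in s_samples]) - k_nonlinear(avg_s)
print(lin_gap, nonlin_gap)
```

For the linear coefficient the average of $k$ equals $k$ of the average, so the steady-state counting argument carries over to averages. For the nonlinear coefficient the two differ, which is why coexistence of more than $m$ species under oscillating conditions is not excluded.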

The non-stationary concurrent exclusion principle "in average" is valid for linear
reproduction coefficients. This is a combination of the Volterra [82] averaging principle and
the Gause principle.

It is worth mentioning that, for our basic system (1), if $k_\mu$ are linear functions of $\mu$,
then the time average of the reproduction coefficient $k_{\mu(t)}$ is the reproduction coefficient
for the average of $\mu(t)$, because of linearity. Therefore, the optimality principle (13) for the
average reproduction coefficient $k^*$ transforms into the following optimality principle for
the reproduction coefficient $k_{\langle\mu\rangle}$ of the average distribution $\langle\mu\rangle$:

$$k_{\langle\mu\rangle}(x) = 0 \ \text{ if } x \in \mathrm{supp}\,\mu^*, \qquad k_{\langle\mu\rangle}(x) \le 0 \ \text{ if } x \in \mathrm{supp}\,\mu(0) \tag{19}$$

(the generalized Volterra averaging principle [82]).

Formally, various definitions of genericity are constructed as follows. All systems (or 
cases, or situations and so on) under consideration are somehow parameterized - by sets 
of vectors, functions, matrices etc. Thus, the "space of systems" Q can be described. 
Then the "thin sets" are introduced into Q, i.e. the sets, which we shall later neglect. 
The union of a finite or countable number of thin sets, as well as the intersection of any 
number of them should be thin again, while the whole Q is not thin. There are two 
traditional ways to determine thinness. 

1. A set is considered thin when it has measure zero. This is reasonable for a
finite-dimensional case, when there is the standard Lebesgue measure: the length, the
area, the volume.

2. But most frequently we deal with functional parameters. In that case it is
common to resort to the second definition, according to which the sets of Baire
first category are negligible. The construction begins with nowhere dense sets. The
set $Y$ is nowhere dense in $Q$ if in any nonempty open set $V \subset Q$ (for example, in
a ball) there exists a nonempty open subset $W \subset V$ (for example, a ball) which
does not intersect $Y$: $W \cap Y = \emptyset$. Roughly speaking, $Y$ is "full of holes": in
any neighborhood of any point of the set $Y$ there is an open hole. A countable union
of nowhere dense sets is called a set of first category. The second usual way is
to define thin sets as the sets of first category. A residual set (a "thick" set) is the
complement of a set of the first category.

For the second approach, the Baire category theorem is important: In a non-empty 
complete metric space, any countable intersection of dense, open subsets is non-empty. 

But even the real line R can be divided into two sets, one of which has zero measure, 
the other is of first category. The genericity in the sense of measure and the genericity in 
the sense of category considerably differ in the applications where both of these concepts 
can be used. The conflict between the two main views on genericity stimulated efforts to 
invent new and stronger approaches. 

Systems (1) were parameterized by continuous maps $\mu \mapsto k_\mu$. Denote by $Q$ the space
of these maps $M \to C(X)$ with the topology of uniform convergence on $M$. It is a Banach
space. Therefore, we shall consider below thin sets in a Banach space $Q$. First of all, let
us consider $n$-dimensional affine compact subsets of $Q$ as a Banach space of affine maps
$\Psi : [0,1]^n \to Q$ ($\Psi(\alpha_1, \ldots, \alpha_n) = \sum_i \alpha_i f_i + \varphi$, $\alpha_i \in [0,1]$, $f_i, \varphi \in Q$) in the maximum norm.
For the image of a map $\Psi$ we use the standard notation $\mathrm{im}\,\Psi$.

Definition 1 A set $Y \subset Q$ is $n$-thin if the set of affine maps $\Psi : [0,1]^n \to Q$ with
non-empty intersection $\mathrm{im}\,\Psi \cap Y \neq \emptyset$ is a set of first category.

All compact sets in infinite-dimensional spaces and closed linear subspaces with codimension
greater than $n$ are $n$-thin. If $\dim Q \le n$, then only the empty set is $n$-thin in $Q$. The
union of a finite or countable number of $n$-thin sets, as well as the intersection of any
number of them, is $n$-thin, while the whole $Q$ is not $n$-thin.

Let us consider compact subsets in Q parametrized by points of a compact space K. It 
can be presented as a Banach space C(K, Q) of continuous maps K — > Q in the maximum 
norm. 

Definition 2 A set $Y \subset Q$ is completely thin if for any compact $K$ the set of continuous
maps $\Psi : K \to Q$ with non-empty intersection $\mathrm{im}\,\Psi \cap Y \neq \emptyset$ is a set of first category.

A set $Y$ in the Banach space $Q$ is completely thin if for any compact set $K$ in $Q$ and
arbitrary positive $\varepsilon > 0$ there exists a vector $q \in Q$ such that $\|q\| < \varepsilon$ and $K + q$ does
not intersect $Y$: $(K + q) \cap Y = \emptyset$. All compact sets in infinite-dimensional spaces and
closed linear subspaces with infinite codimension are completely thin. Only the empty set is
completely thin in a finite-dimensional space. The union of a finite or countable number
of completely thin sets, as well as the intersection of any number of them, is completely
thin, while the whole $Q$ is not completely thin.

Proposition 1 If a set $Y$ in the Banach space $Q$ is completely thin, then for any compact
metric space $K$ the set of continuous maps $\Psi : K \to Q$ with non-empty intersection
$\mathrm{im}\,\Psi \cap Y \neq \emptyset$ is completely thin in the Banach space $C(K, Q)$.

To prove this proposition it is sufficient to mention that for any compact $P$ the space
$C(P, C(K, Q))$ of continuous maps $P \to C(K, Q)$ is isomorphic to the space $C(P \times K, Q)$,
and $P \times K$ is compact again.

Below the wording "almost always" means: the set of exclusions is completely thin. 
The main result presented in this section sounds as follows: almost always the sets of 
global maxima of functions from a compact set are uniformly almost finite. 

Proposition 2 Let $X$ have no isolated points. Then almost always a function $f \in C(X)$
has a nowhere dense set of zeros $\{x \in X \mid f(x) = 0\}$ (the set of exclusions is completely thin
in $C(X)$).

In order to prove this proposition, let us mention that the topology of $X$ has a countable
base $\{U_i\}_{i=1}^{\infty}$. For any $i$ the set of functions

$$\mathrm{Annul}\,U_i = \{ f \in C(X) \mid f(x) = 0 \text{ for all } x \in U_i \}$$

is a closed subspace of $C(X)$ (even an ideal) with infinite codimension. For any compact
set $K \subset C(X)$ the set of shift vectors

$$V_i = \{ y \in C(X) \mid (K + y) \cap \mathrm{Annul}\,U_i = \emptyset \}$$

is open and dense in $C(X)$. Hence, the set of shifts $\bigcap_i V_i$ is a dense residual set in $C(X)$.
Combining Proposition 2 with Proposition 1, we get the following.

Proposition 3 Let $X$ have no isolated points. Then for any compact space $K$ and almost
every continuous map $\Psi : K \to C(X)$ all functions $f \in \mathrm{im}\,\Psi$ have nowhere dense sets of
zeros (the set of exclusions is completely thin in $C(K, C(X))$).

In other words, in almost every compact family of continuous functions all the functions
have nowhere dense sets of zeros.

In the construction of "almost finite" sets we follow a rather old idea that was used by
Liouville in the construction of his "almost rational" transcendental numbers [12]. A Liouville
number is a transcendental number which has very close rational number approximations.
An irrational number $\beta$ is called a Liouville number if, for each $n$, there exist integers $p$
and $q > 1$ such that

$$\left| \beta - \frac{p}{q} \right| < \frac{1}{q^n}.$$

An example of such a number is Liouville's constant, sometimes also called Liouville's
number, that is, the real number defined by

$$L = \sum_{n=1}^{\infty} 10^{-n!}.$$

Liouville's constant is a decimal fraction with a 1 in each decimal place corresponding
to a factorial, and zeros everywhere else. It was the first decimal constant to be proven
transcendental.
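The Liouville property of this constant can be verified with exact rational arithmetic. In the sketch below the function names are illustrative, and the constant itself is replaced by a longer partial sum, which is an assumption of the sketch: the neglected tail is far smaller than any of the bounds being tested.

```python
from fractions import Fraction
from math import factorial

def liouville_partial_sum(m):
    """Partial sum sum_{n=1}^{m} 10**(-n!) as an exact fraction."""
    return sum(Fraction(1, 10 ** factorial(n)) for n in range(1, m + 1))

def is_liouville_close(m, terms=6):
    """Check 0 < L - p/q < 1/q**m for p/q = m-th partial sum, q = 10**(m!).

    L is approximated from below by the partial sum with `terms` summands.
    """
    l_approx = liouville_partial_sum(terms)
    p_over_q = liouville_partial_sum(m)
    q = 10 ** factorial(m)
    return 0 < l_approx - p_over_q < Fraction(1, q ** m)

print(liouville_partial_sum(3))                         # 110001/1000000
print(all(is_liouville_close(m) for m in range(1, 6)))
```

The $m$-th partial sum has denominator $q = 10^{m!}$ and misses the constant by roughly $10^{-(m+1)!} = q^{-(m+1)}$, which is why the approximation is better than $1/q^m$ for every $m$.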




In some sense, almost all real numbers are Liouville numbers: the set of all Liouville
numbers is a residual set. To prove this statement let us enumerate all the rational
numbers: $r_1, r_2, \ldots$, $r_n = p_n / q_n$. The following set $U_\varepsilon$ is open and dense on the real line:

$$U_\varepsilon = \bigcup_{n=1}^{\infty} \left\{ x \ \middle|\ \left| x - \frac{p_n}{q_n} \right| < \frac{\varepsilon}{q_n^n} \right\}. \tag{20}$$

The intersection of $U_\varepsilon$ for $\varepsilon = 1/2, 1/4, 1/8, \ldots$,

$$U = \bigcap_{k=1}^{\infty} U_{2^{-k}}, \tag{21}$$

is a residual set, and the numbers from $U$ are Liouville numbers. On the other hand,
$U$ has zero Lebesgue measure, and it gives us an example of dividing the real line into a
set of first category (the complement of $U$) and a set of zero measure, $U$.

Let us consider the space of closed subsets of the compact metric space $X$ endowed with
the Hausdorff distance. The Hausdorff distance between closed subsets $A$, $B$ of $X$ is

$$\mathrm{dist}(A, B) = \max\left\{ \sup_{x \in A} \inf_{y \in B} \rho(x, y), \ \sup_{x \in B} \inf_{y \in A} \rho(x, y) \right\}. \tag{22}$$
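For finite sets the suprema and infima in (22) become maxima and minima, so the distance can be computed directly. A minimal sketch, assuming points on the real line with $\rho(x, y) = |x - y|$ by default (the function name and the default metric are illustrative choices):

```python
def hausdorff(A, B, rho=lambda x, y: abs(x - y)):
    """Hausdorff distance (22) between two finite non-empty sets A and B."""
    d_ab = max(min(rho(x, y) for y in B) for x in A)   # how far A sticks out of B
    d_ba = max(min(rho(x, y) for y in A) for x in B)   # how far B sticks out of A
    return max(d_ab, d_ba)

print(hausdorff([0, 1], [0, 3]))   # 2
print(hausdorff([0, 1], [1, 0]))   # 0
```

Both one-sided terms are needed: in the first call every point of `[0, 1]` is within 1 of `[0, 3]`, but the point 3 is at distance 2 from `[0, 1]`.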

The almost finite sets were introduced in [28] for the description of the typical sets of
maxima for continuous functions from a compact set. This definition depends on an
arbitrary sequence $\varepsilon_n > 0$, $\varepsilon_n \to 0$. For any such sequence we construct a class of subsets
$Y \subset X$ that can be approximated by finite sets faster than $\varepsilon_n \to 0$, and for families of sets
we introduce a notion of uniform approximation by finite sets faster than $\varepsilon_n \to 0$:

Definition 3 Let $\varepsilon_n > 0$, $\varepsilon_n \to 0$. The set $Y \subset X$ can be approximated by finite sets
faster than $\varepsilon_n \to 0$ if for any $\delta > 0$ there exists a finite set $S_N$ such that
$\mathrm{dist}(S_N, Y) < \delta \varepsilon_N$. The sets of a family $\mathcal{Y}$ can be uniformly approximated by finite sets
faster than $\varepsilon_n \to 0$ if for any $\delta > 0$ there exists a number $N$ such that for any $Y \in \mathcal{Y}$
there exists a finite set $S_N$ such that $\mathrm{dist}(S_N, Y) < \delta \varepsilon_N$.

The simplest example of an almost finite set on the real line for a given $\varepsilon_n \to 0$ ($\varepsilon_n > 0$)
is the sequence $\varepsilon_n / n$. If $\varepsilon_n < \mathrm{const}/n$, then a set $Y$ on the real line which can be
approximated by finite sets faster than $\varepsilon_n \to 0$ has zero Lebesgue measure. At the same
time, it is nowhere dense, because it can be covered by a finite number of intervals with
an arbitrarily small sum of lengths (hence, in any interval we can find a subinterval free of
points of $Y$).

Let us study the sets of global maxima $\mathrm{argmax}\,f$ for continuous functions $f \in C(X)$.
For each $f \in C(X)$ and any $\varepsilon > 0$ there exists $\phi \in C(X)$ such that $\|f - \phi\| \le \varepsilon$ and
$\mathrm{argmax}\,\phi$ consists of one point. Such a function $\phi$ can be chosen in the form

$$\phi(x) = f(x) + \frac{\varepsilon}{1 + \rho(x, x_0)^2}, \tag{23}$$

where $x_0$ is an arbitrary element of $\mathrm{argmax}\,f$. In this case $\mathrm{argmax}\,\phi = \{x_0\}$.

Hence, the set $\mathrm{argmax}\,f$ can be reduced to one point by an arbitrarily small perturbation
of the function $f$. On the other hand, it is impossible to extend the set $\mathrm{argmax}\,f$
significantly by a sufficiently small perturbation: the dependence of this set on $f$ is
semicontinuous in the following sense.

Proposition 4 For given $f \in C(X)$ and any $\varepsilon > 0$ there exists $\delta > 0$ such that, whenever
$\|f - \phi\| < \delta$, then

$$\max_{x \in \mathrm{argmax}\,\phi} \ \min_{y \in \mathrm{argmax}\,f} \rho(x, y) < \varepsilon. \tag{24}$$

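In order to derive the dependence of $\delta$ on $\varepsilon$ in Proposition 4 we can use the following
auxiliary function:

$$\eta_f(r) = \min_{\rho(x, \mathrm{argmax}\,f) \ge r} \{ \max f - f(x) \}, \tag{25}$$

where $\rho(x, \mathrm{argmax}\,f)$ is the distance between a set and a point.

The function $\eta_f(r)$ is monotone nondecreasing, $\eta_f(0) = 0$, and $\eta_f(r) > 0$ for $r > 0$.
We can take in Proposition 4 any $\delta \le \eta_f(\varepsilon)/2$, for example, $\delta = \eta_f(\varepsilon)/2$. Indeed, if
$\|f - \phi\| < \delta \le \eta_f(\varepsilon)/2$, then $\phi(x) < \max f - \eta_f(\varepsilon)/2 < \phi(x_0)$ for every $x$ with
$\rho(x, \mathrm{argmax}\,f) \ge \varepsilon$ and every $x_0 \in \mathrm{argmax}\,f$, so such $x$ cannot belong to $\mathrm{argmax}\,\phi$.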
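Both statements, the reduction of $\mathrm{argmax}\,f$ to one point by the cap (23) and the semicontinuity of Proposition 4, can be tried out on a finite grid. The sketch below is an illustration under stated assumptions: the grid stands in for $X$, $\rho(x, y) = |x - y|$, the function $f(x) = -(x^2 - 1)^2$ and the cosine perturbation are hypothetical choices, and the perturbation is kept below $\eta_f(\varepsilon)/2$ in the sup-norm.

```python
import math

def dist_to_set(x, S):
    """rho(x, S) for the metric rho(x, y) = |x - y| on the real line."""
    return min(abs(x - y) for y in S)

def argmax_set(values, xs):
    """All grid points where the maximum of `values` is attained."""
    top = max(values)
    return [x for x, v in zip(xs, values) if v == top]

def eta(values, xs, r):
    """eta_f(r) from (25): smallest gap max f - f(x) at distance >= r from argmax f."""
    top = max(values)
    A = argmax_set(values, xs)
    return min(top - v for x, v in zip(xs, values) if dist_to_set(x, A) >= r)

xs = [j / 10 for j in range(-20, 21)]        # finite grid standing in for X
f = [-(x * x - 1.0) ** 2 for x in xs]        # assumed f with argmax f = {-1, 1}

# (23): adding eps / (1 + rho(x, x0)^2) leaves the single maximizer x0.
eps_cap, x0 = 1e-3, 1.0
phi_cap = [v + eps_cap / (1.0 + (x - x0) ** 2) for x, v in zip(xs, f)]

# Proposition 4: a perturbation below eta_f(eps) / 2 moves argmax by less than eps.
eps = 0.3
delta = eta(f, xs, eps) / 2
phi = [v + 0.9 * delta * math.cos(7 * x) for x, v in zip(xs, f)]
shift = max(dist_to_set(x, argmax_set(f, xs)) for x in argmax_set(phi, xs))

print(argmax_set(f, xs), argmax_set(phi_cap, xs), shift)
```

The first part shows that an arbitrarily small cap selects one of the two maximizers; the second shows that a perturbation within the $\eta_f$ bound cannot create maximizers farther than $\varepsilon$ from the original ones.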

In particular, if the set $\mathrm{argmax}\,f$ consists of one point $x_0$, then for sufficiently small
perturbations of $f$ the set $\mathrm{argmax}$ remains in an arbitrarily small ball near $x_0$.

These constructions can be generalized to $k$-parametric affine compact families of
continuous functions. Let us consider affine maps of the cube $[0,1]^k$ into $C(X)$,
$\Phi : [0,1]^k \to C(X)$. The space of all such maps is a Banach space endowed with the maximum
norm.

Proposition 5 For any affine map $\Phi : [0,1]^k \to C(X)$ and an arbitrary $\varepsilon > 0$ there
exists a continuous function $\psi \in C(X)$ such that $\|\psi\| < \varepsilon$ and the set $\mathrm{argmax}(f + \psi)$
includes not more than $k + 1$ points for all $f \in \mathrm{im}\,\Phi$.

To prove Proposition 5 we can use the following Lemma.

Lemma 1 Let $Q \subset C(X)$ be a compact set of functions, $\varepsilon > 0$. Then there are a finite
set $Y \subset X$ and a function $\phi \in C(X)$ such that $\|\phi\| < \varepsilon$, and any function $f \in Q + \phi$
achieves its maximum only on $Y$: $\mathrm{argmax}\,f \subset Y$.

To find the shift function $\phi$ let us use the following auxiliary functions: for $f \in C(X)$

$$\gamma_f(r) = \max_{\rho(x, y) \le r} |f(x) - f(y)|. \tag{26}$$

The function $\gamma_f(r)$ is monotone nondecreasing, $\gamma_f(0) = 0$, and $\gamma_f(r) \to 0$ for $r \to 0$.
Sometimes one calls it the uniform continuity module of $f$. For the compact set of functions
$Q \subset C(X)$ we can also define the uniform continuity module:

$$\gamma_Q(r) = \max_{f \in Q} \gamma_f(r) = \max_{f \in Q, \ \rho(x, y) \le r} |f(x) - f(y)|. \tag{27}$$

The function $\gamma_Q(r)$ is monotone nondecreasing, $\gamma_Q(0) = 0$, and $\gamma_Q(r) \to 0$ for $r \to 0$.
For a general compact space $X$ the functions $\gamma_f(r)$, $\gamma_Q(r)$ might not be continuous at points
$r \neq 0$. Instead of them we can use their continuous majorants, for example, the concave
closures of these functions. The hypograph of a function $\gamma(r)$, denoted $\mathrm{hyp}(\gamma)$, is the set
$\{(r, g) \mid g \le \gamma(r)\}$. Note that the hypograph is the region below the graph of $\gamma$. The
concave closure of $\gamma(r)$ (denoted $\mathrm{conc}\,\gamma$) is the function that has as its hypograph the
closure of the convex hull of $\mathrm{hyp}(\gamma)$ (that is, the smallest closed and convex set containing
$\mathrm{hyp}(\gamma)$). If a function $\gamma(r)$ on an interval $[0, R]$ is monotone nondecreasing, bounded,
$\gamma(0) = 0$, and $\gamma(r) \to 0$ for $r \to 0$ (that is, $\gamma(r)$ is continuous at the point $r = 0$),
then the function $\mathrm{conc}\,\gamma$ on the interval $[0, R]$ is continuous, monotone nondecreasing,
$\mathrm{conc}\,\gamma(r) \ge \gamma(r)$ for all $r \in [0, R]$, $\mathrm{conc}\,\gamma(0) = \gamma(0) = 0$ and $\mathrm{conc}\,\gamma(R) = \gamma(R)$.

For given $\gamma$ with $0 < \gamma < \max_r \gamma_Q(r)$ we can find $r(\gamma)$, the unique solution of the equation
$\mathrm{conc}\,\gamma_Q(r) = \gamma$.

For the given $\varepsilon > 0$, let us find in $X$ a finite $r(\varepsilon/2)$-net $\{x_1, x_2, \ldots, x_N\} \subset X$. For
each $x_i$ we define an $\varepsilon$-small "cap function"

M x ) = \ (| -1Q (^^' Xi ) ? r)) iip(x,Xi) < r 2 , 

</>i(x) = if p(x,Xi) > r 2 , (28) 

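where $r_1 = r(\varepsilon/2)$, $r_2 = \frac{1}{2} \min\left\{ r_1, \ \frac{1}{2} \min_{i \neq j} \rho(x_i, x_j) \right\}$.

If $\rho(x, x_i) < \varepsilon/2$ and $f \in Q$, then $|f(x) - f(x_i)| < \phi_i(x)$. If $\phi_i(x) \neq 0$ then $\phi_j(x) = 0$
for all $j \neq i$; $\|\phi_i\| < \varepsilon$ and $\phi_i(x) \ge 0$ for each $i$.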

We can define the shift function in Lemma 1 as

$$\phi(x) = \sum_{i=1}^{N} \phi_i(x). \tag{29}$$

Lemma 1 allows us to reduce some problems concerning global maxima of continuous
functions from a compact set $Q \subset C(X)$ to questions about functions on finite subsets
of $X$. In particular, Proposition 5 reduces to a question about the existence of non-trivial
solutions for finite systems of linear equations. Let us consider functions on the finite net
$\{x_1, x_2, \ldots, x_N\} \subset X$. For each affine map $\Phi : [0,1]^k \to C(X)$ ($\Phi(\alpha_1, \ldots, \alpha_k) = \sum_i \alpha_i f_i + \varphi$,
$\alpha_i \in [0,1]$, $f_i, \varphi \in C(X)$) the values $\Phi(\alpha_1, \ldots, \alpha_k)(x_i)$ are linear (non-homogeneous)
functions on the $k$-cube $[0,1]^k$. For any $q$ different points $x_{i_1}, \ldots, x_{i_q}$ of the net the system
of equations

$$\Phi(\alpha_1, \ldots, \alpha_k)(x_{i_1}) = \ldots = \Phi(\alpha_1, \ldots, \alpha_k)(x_{i_q}) \tag{30}$$

can, generically, have a solution in the $k$-cube $[0,1]^k$ only for $q \le k + 1$: it consists of
$q - 1$ linear equations for $k$ unknowns. The degenerate case with solutions for $q > k + 1$ can be
destroyed by an arbitrarily small perturbation.
This simple remark together with Lemma 1 proves Proposition 5.
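The bound $k + 1$ can be observed on a one-parameter example. The sketch below is an illustration only: the grid, the function $\varphi(x) = -(x^2 - 1)^2$ and the choice $f_1(x) = x$ are assumptions, and exact rational arithmetic is used so that ties between maxima are detected reliably.

```python
from fractions import Fraction

def argmax_count(values):
    """Number of grid points where the maximum of `values` is attained."""
    top = max(values)
    return sum(1 for v in values if v == top)

xs = [Fraction(j, 2) for j in range(-4, 5)]     # grid on [-2, 2]
phi = [-(x * x - 1) ** 2 for x in xs]           # assumed phi with two equal maxima at -1, 1
f1 = list(xs)                                   # assumed f_1(x) = x

def family_member(alpha):
    """Phi(alpha) = alpha * f_1 + phi: an affine family with k = 1 parameter."""
    return [alpha * a + b for a, b in zip(f1, phi)]

alphas = [Fraction(i, 10) for i in range(11)]   # sample of the cube [0, 1]
counts = {alpha: argmax_count(family_member(alpha)) for alpha in alphas}

print(max(counts.values()))                     # 2 = k + 1, attained only at alpha = 0
```

Two global maxima occur at the single parameter value $\alpha = 0$, where the one equation $\Phi(\alpha)(-1) = \Phi(\alpha)(1)$ is satisfied; for all other sampled $\alpha$ the maximizer is unique.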

Note that Proposition 5 and Lemma 1 demonstrate different sources of discreteness:
in Lemma 1 it is the approximation of a compact set by a finite net, and in Proposition 5
it is the connection between the number of parameters and the possible number of global
maxima in a $k$-parametric family of functions. There is no direct connection between
the values of $N$ and $k$, and it might be that $N \gg k$. For smooth functions in finite-dimensional
real space, polynomial approximations can be used instead of Lemma 1 in order to prove
the analogue of Proposition 5.

For any compact $K$ the space of continuous maps $C(K, C(X))$ is isomorphic to the
space of continuous functions $C(K \times X)$. Each continuous map $F : K \to C(X)$ can
be approximated with an arbitrary accuracy $\varepsilon > 0$ by finite sums of the following form
($k \ge 0$):

$$F(y)(x) = \sum_{i=1}^{k} \alpha_i(y) f_i(x) + \varphi(x) + o, \quad
y \in K, \ x \in X, \ 0 \le \alpha_i \le 1, \ f_i, \varphi \in C(X), \ |o| < \varepsilon. \tag{31}$$

Each set $f_i, \varphi \in C(X)$ generates a map $\Phi : [0,1]^k \to C(X)$. A dense subset in the space
of these maps satisfies the statement of Proposition 5: each function from $\mathrm{im}\,\Phi$ has not
more than $k + 1$ points of global maximum. Let us use the notation $P_k$ for this set of maps $\Phi$,
the notation $P_k^K$ for the corresponding set of maps $F : K \to C(X)$ which have the form of finite
sums (31), and $P^K = \bigcup_k P_k^K$.

For each $\Phi \in P_k^K$ and any $\varepsilon > 0$ there is $\delta = \delta_\Phi(\varepsilon) > 0$ such that, whenever $\|\Psi - \Phi\| <
\delta_\Phi(\varepsilon)$, the set $\mathrm{argmax}\,f$ belongs to a union of $k + 1$ balls of radius $\varepsilon$ for any $f \in \mathrm{im}\,\Psi$
(Proposition 4).

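Let us introduce some notations: for $k \ge 0$ and $\varepsilon > 0$

$$U^K_{k,\varepsilon} = \{ \Psi \in C(K, C(X)) \mid \|\Psi - \Phi\| < \delta_\Phi(\varepsilon) \text{ for some } \Phi \in P_k^K \};$$

for $\varepsilon_i > 0$, $\varepsilon_i \to 0$

$$V^K_{\{\varepsilon_i\}} = \bigcup_{k=0}^{\infty} U^K_{k,\varepsilon_k};$$

and, finally,

$$W^K_{\{\varepsilon_i\}} = \bigcap_{s=1}^{\infty} V^K_{\{\varepsilon_i / s\}}.$$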
The set $P^K$ is dense in $C(K, C(X))$. Any $F \in P^K$ has the form of a finite sum (31), and
any $f \in \mathrm{im}\,F$ has not more than $k + 1$ points of global maximum, where $k$ is the number of
summands in the presentation (31). The sets $V^K_{\{\varepsilon_i\}}$ are open and dense in the Banach space
$C(K, C(X))$ for any sequence $\varepsilon_i > 0$, $\varepsilon_i \to 0$. The set $W^K_{\{\varepsilon_i\}}$ is the intersection of a countable
number of open dense sets. For any $F \in W^K_{\{\varepsilon_i\}}$ the sets of the family $\{\mathrm{argmax}\,f \mid f \in \mathrm{im}\,F\}$
can be uniformly approximated by finite sets faster than $\varepsilon_n \to 0$. It is proven that this
property is typical in the Banach space $C(K, C(X))$ in the sense of category. In order
to prove that the set of exclusions is completely thin in $C(K, C(X))$ it is sufficient to
use the approach of Proposition 1. Note that for an arbitrary compact space $Q$ the set of
continuous maps $Q \to C(K, C(X))$ in the maximum norm is isomorphic to the spaces
$C(Q \times K, C(X))$ and $C(Q \times K \times X)$. The space $Q \times K$ is compact. We can apply the
previous construction to the space $C(Q \times K, C(X))$ for arbitrary compact $Q$ and get the
result: the set of exclusions is completely thin in $C(K, C(X))$.

In the definition of $W^K_{\{\varepsilon_i\}}$ we use only one sequence $\varepsilon_i > 0$, $\varepsilon_i \to 0$. Of course, for any
finite or countable set of sequences the intersection of the corresponding sets $W^K_{\{\varepsilon_i\}}$ is also a
residual set, and we can claim that almost always the sets $\{\operatorname{argmax} f \mid f \in \mathrm{im}\,F\}$ can be
uniformly approximated by finite sets faster than $\varepsilon_n \to 0$ for all given sequences. Let us
mention that the set of all recursively enumerable countable sets is also countable and not
a continuum. This observation is very important for algorithmic foundations of probability
theory [31]. Let $L$ be the set of all sequences of real numbers $\varepsilon_n > 0$, $\varepsilon_n \to 0$ with the
property: for each $\{\varepsilon_n\} \in L$ the rational hypograph $\{(n, r) \mid \varepsilon_n > r \in Q\}$ (where $Q$ is the
set of rational numbers) is recursively enumerable. The set

\[
\bigcap_{\{\varepsilon_i\} \in L} W^K_{\{\varepsilon_i\}}
\]

is a residual set again, and almost always the sets $\{\operatorname{argmax} f \mid f \in \mathrm{im}\,F\}$ can be uniformly
approximated by finite sets faster than $\varepsilon_n \to 0$ for all given sequences from $L$.

2.4 Selection efficiency 

The first application of the extremal principle for the $\omega$-limit sets is the theorem of
selection efficiency. The dynamics of a system with inheritance indeed leads to selection
in the limit $t \to \infty$. In the typical situation, the diversity in the limit $t \to \infty$ becomes less
than the initial diversity. There is an efficient selection for the "best". The basic effects of
selection are formulated below. Let $X$ be a compact metric space without isolated points.

Theorem 2 (Theorem of selection efficiency.) 

1. For almost every system (1) the support of any $\omega$-limit distribution is nowhere dense
in $X$ (and it has Lebesgue measure zero for Euclidean space).

2. Let $\varepsilon_n > 0$, $\varepsilon_n \to 0$ be an arbitrarily chosen sequence. The following statement is true
for almost every system (1). Let the support of the initial distribution be the whole
$X$. Then the support of any $\omega$-limit distribution can be approximated by finite sets
uniformly faster than $\varepsilon_n \to 0$.

The set of exclusive systems that do not satisfy statement 1 or 2 is completely thin.

These properties hold for continuous reproduction coefficients. It is well known
that it is dangerous to rely on genericity among continuous functions. For example,
almost all continuous functions are nowhere differentiable. But properties 1 and 2 also hold
for smooth reproduction coefficients on manifolds, and sometimes allow one to
replace "almost finiteness" by simple finiteness.

To prove the first statement, it is sufficient to refer to Proposition 3. In order to clarify
the second part of this theorem, note that:

1. The support of an arbitrary $\omega$-limit distribution $\mu^*$ consists of points of global maximum
of the average reproduction coefficient on the support of the initial distribution. The
corresponding maximum value is zero.

2. Almost always a function has only one point of global maximum, and corresponding 
maximum value is not 0. 

3. In a one-parametric family of functions almost always there may occur zero values 
of the global maximum (at one point), which cannot be eliminated by a small 
perturbation, and individual functions of the family may stably have two global 
maximum points. 

4. For a generic n-parameter family of functions, there may exist stably a function 
with n points of global maximum and with zero value of this maximum. 

5. Our phase space $M$ is compact. The set of corresponding reproduction coefficients
$k_M$ in $C(X)$ for the given map $\mu \to k_\mu$ is compact too. The average reproduction
coefficients belong to the closed convex hull of this set, $\mathrm{conv}(k_M)$, and it is compact
too.

6. A compact set in a Banach space can be approximated by compacts from finite- 
dimensional linear manifolds. Generically, the function from such a compact can 
have not more than n points of global maximum with zero value, where n is the 
dimension of manifold. 

The rest of the proof of the second statement is purely technical. Some technical
details are presented in the previous section. The easiest demonstration of the "natural"
character of these properties is the demonstration of instability of exclusions: if, for
example, a function has several points of global maximum, then an arbitrarily small
perturbation (in all commonly used norms) can transform it into a function with a
unique point of global maximum. However, "stable" does not always mean "dense". The
discussed properties of the system (1) are valid in a very strong sense: the set of exclusions
is completely thin.

2.5 Gromov's interpretation of selection theorems 

In his talk [37], M. Gromov offered a geometric interpretation of the selection theorems.
Let us consider dynamical systems in the standard $m$-simplex $\sigma_m$ in the $(m+1)$-dimensional
space $R^{m+1}$:

\[
\sigma_m = \Big\{ x \in R^{m+1} \ \Big|\ x_i \ge 0,\ \sum_{i=1}^{m+1} x_i = 1 \Big\}.
\]

We assume that the simplex $\sigma_m$ is positively invariant with respect to these dynamical
systems: if the motion starts in $\sigma_m$ at some time $t_0$, then it remains in $\sigma_m$ for $t > t_0$. Let us
consider the motions that start in the simplex $\sigma_m$ at $t = 0$ and are defined for $t > 0$.

For large $m$, almost all the volume of the simplex $\sigma_m$ is concentrated in a small neighborhood
of the center of $\sigma_m$, near the point $c = \left(\frac{1}{m+1}, \frac{1}{m+1}, \ldots, \frac{1}{m+1}\right)$. Hence, one can expect that a
typical motion of a general dynamical system in $\sigma_m$ for sufficiently large $m$ spends almost
all the time in a small neighborhood of $c$.

Indeed, the $m$-dimensional volume of $\sigma_m$ is $V_m = \sqrt{m+1}/m!$. The part of $\sigma_m$ where $x_i \ge \varepsilon$ has
the volume $(1 - \varepsilon)^m V_m$. Hence, the part of $\sigma_m$ where $x_i < \varepsilon$ for all $i = 1, \ldots, m+1$ has
the volume $V_\varepsilon > (1 - (m+1)(1-\varepsilon)^m) V_m$. Note that $(m+1)(1-\varepsilon)^m \sim m \exp(-\varepsilon m) \to 0$
if $m \to \infty$ ($1 > \varepsilon > 0$). Therefore, for $m \to \infty$, $V_\varepsilon = (1 - o(1)) V_m$. The volume $W_\rho$ of the
part of $\sigma_m$ with Euclidean distance to the center $c$ less than $\rho > 0$ can be estimated as
follows: $W_\rho > V_\varepsilon$ for $\varepsilon \sqrt{m+1} = \rho$, hence $W_\rho > (1 - (m+1)(1 - \rho/\sqrt{m+1})^m) V_m$. Finally,
$(m+1)(1 - \rho/\sqrt{m+1})^m \sim m \exp(-\rho \sqrt{m})$, and $W_\rho = (1 - o(1)) V_m$ for $m \to \infty$. Let us
mention here the opposite concentration effect for an $m$-dimensional ball $B_m$: for $m \to \infty$
most of its volume is concentrated in an arbitrarily small vicinity of its boundary,
the sphere. This effect is the essence of the famous equivalence of microcanonical and
canonical ensembles in statistical physics (for a detailed discussion see [29]).
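
The volume estimate for the simplex can be checked by a small Monte Carlo experiment. The sketch below is only an illustration: the function names, the sample size and the seed are assumptions, and it relies on the standard fact that normalised independent exponential variables are uniformly distributed on the simplex. It compares the sampled share of the simplex with all coordinates below $\varepsilon$ against the lower bound $1 - (m+1)(1-\varepsilon)^m$.

```python
import random

def sample_simplex(m, rng):
    """Draw a point uniformly from the standard m-simplex in R^(m+1)."""
    # Normalised i.i.d. exponential variables are uniform on the simplex.
    e = [rng.expovariate(1.0) for _ in range(m + 1)]
    total = sum(e)
    return [v / total for v in e]

def fraction_all_small(m, eps, n_samples=2000, seed=0):
    """Monte Carlo estimate of V_eps / V_m (share with all coordinates < eps)."""
    rng = random.Random(seed)
    hits = sum(1 for _ in range(n_samples)
               if max(sample_simplex(m, rng)) < eps)
    return hits / n_samples

def union_bound(m, eps):
    """Lower bound 1 - (m + 1)(1 - eps)^m on V_eps / V_m."""
    return 1.0 - (m + 1) * (1.0 - eps) ** m

if __name__ == "__main__":
    # The bound is trivial (negative) for small m and approaches 1 as m grows.
    for m in (50, 200, 800):
        print(m, union_bound(m, 0.05), fraction_all_small(m, 0.05))
```

For small $m$ the bound is negative and says nothing; it becomes informative only once $\varepsilon m$ is large compared with $\ln m$, which is the regime discussed above.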

Let us consider dynamical systems with an additional property ("inheritance"): all
the faces of the simplex $\sigma_m$ are also positively invariant with respect to the systems with
inheritance. It means that if some $x_i = 0$ initially at the time $t = 0$, then $x_i = 0$ for $t > 0$
for all motions in $\sigma_m$. The essence of selection theorems is as follows: a typical motion
of a typical dynamical system with inheritance spends almost all the time in a small
neighborhood of low-dimensional faces, even if it starts near the center of the simplex.

Let us denote by $\partial_r \sigma_m$ the union of all $r$-dimensional faces of $\sigma_m$. Due to the selection
theorems, a typical motion of a typical dynamical system with inheritance spends almost
all the time in a small neighborhood of $\partial_r \sigma_m$ with $r \ll m$. It does not necessarily reside
near just one face from $\partial_r \sigma_m$, but can travel in the neighborhood of different faces from $\partial_r \sigma_m$
(the drift effect). The minimax estimation of the number of points in $\omega$-limit distributions
through the diameters $\varepsilon_n > 0$ of the set $\mathrm{conv}(k_M)$ is the estimation of $r$.

2.6 Decreasing measures of diversity, Lyapunov functionals, and Burg Entropy

The distinguished Lyapunov functionals play an important role in kinetics. For physical and
chemical systems such a functional is, as a rule, the entropy or one of the related functionals.
The standard examples are the free (or Helmholtz) energy and the free enthalpy (or Gibbs
energy). It appears that for system (1) there generically exist plenty of Lyapunov
functionals. They can be considered as the decreasing measures of diversity. These
functionals are very similar to the entropy, but rather to the Burg entropy [TTJ [30j, and 
not to the classic Boltzmann-Gibbs-Shannon entropy. 

Generically, we can assume that the convex compact set $\mathrm{conv}(k_M)$ does not include
zero. The set of exclusions from this rule is completely thin. Then there exists a continuous
functional $l$ on $C(X)$ with positive values on $\mathrm{conv}(k_M)$. It is a measure on $X$ (by
definition). Of course, values outside $\mathrm{supp}\,\mu(0)$ do not have any relation to reality,
and it is sufficient to discuss only the case $\mathrm{supp}\,\mu(0) = X$. Any solution of (1) can be
presented in the form $\mu(t) = \rho(t)\mu(0)$, where $\rho(t) \in C(X)$, $\rho(t) > 0$ at each $t$, and

\[
\frac{d \ln \rho(t)}{dt} = k_{\mu(t)}. \tag{32}
\]

Hence, the "entropy"

\[
S_l[\mu(t)] = -[l, \ln \rho(t)] \tag{33}
\]

monotonically decreases:

\[
\frac{dS_l}{dt} = -[l, k_{\mu(t)}] < 0. \tag{34}
\]

In order to avoid the dependence on the initial distribution $\mu(0)$ we can restrict the initial
system (1) onto its invariant subspace, the space of measures which have the form $\mu = \rho\mu^0$,
where $\mu^0$ is a given measure with $\mathrm{supp}\,\mu^0 = X$, and $\rho \in C(X)$.

The space $L^2_{\mu^0}(X)$ is the completion of $C(X)$ with respect to the norm

\[
\|f\|_{\mu^0} = [\mu^0, f^2]^{1/2}. \tag{35}
\]

It is a Hilbert space. For the scalar product we shall use the notation $(\cdot, \cdot)_{\mu^0}$.
The compact convex set $\mathrm{conv}(k_M)$ is also compact in $L^2_{\mu^0}(X)$. The set

\[
D_{\mu^0} = \{ \varphi \in L^2_{\mu^0}(X) : (\varphi, f)_{\mu^0} > 0 \text{ for any } f \in \mathrm{conv}(k_M) \}
\]

is open in $L^2_{\mu^0}(X)$. Generically, it is nonempty, and, hence, there is a dense set of continuous
functions in $D_{\mu^0}$. For any function $g \in D_{\mu^0}$ we can define the related "entropy"

\[
S[\rho] = -(\ln \rho, g)_{\mu^0} = -[\mu^0, g \ln \rho]. \tag{36}
\]

This entropy is the average of the density logarithm with a weight $-g$; the set of allowed
weights depends on the reproduction coefficient. The entropy (36) decreases for each solution
of (1) that has the form $\mu(t) = \rho(t)\mu^0$ with a positive initial condition ($\rho(0)$ is a strictly
positive function).

The introduced entropies decrease monotonically to minus infinity. It is clear that all
measures of diversity, including the classical entropy, should decrease as a result of the
selection process. The only question was about the monotonicity of this decrease.



3 Drift and mutations 
3.1 Drift equations 

So far, we talked about the support of an individual $\omega$-limit distribution. For almost all
systems it is small. But this does not mean that the union of these supports is small
even for one solution $\mu(t)$. It is possible that a solution is a finite set of narrow peaks
that become more and more narrow in time and move more and more slowly, yet do not tend
to fixed positions: they continue to move along their trajectories, and the path covered
tends to infinity as $t \to \infty$.

This effect was not discovered for a long time because the slowing down of the peaks
was taken for their tendency to fixed positions. To the best of our knowledge, the first
detailed publication of the drift equations and the corresponding types of stability appeared
in the book [28]; the first examples of coevolution drift on a line were published in the series of
papers [69].

There are other difficulties related to the typical properties of continuous functions,
which are not typical for the smooth ones. Let us illustrate them for the distributions
over a straight line segment. Add to the reproduction coefficients the sum of small
and narrow peaks located on the straight line at a distance from each other much larger than
the peak width (although this distance is itself $\varepsilon$-small). However small the peak height is
chosen, one can choose the width and frequency of the peaks on the straight line in such a way
that from any initial distribution $\mu_0$ whose support is the whole segment, at $t \to \infty$ we obtain
$\omega$-limit distributions concentrated at the points of maximum of the added peaks.

Such a model perturbation is small in the space of continuous functions. Therefore, it
can be put as follows: by a small continuous perturbation the limit behavior of system (1)
can be reduced onto an $\varepsilon$-net for sufficiently small $\varepsilon$. But in the general case this cannot
be done with small smooth perturbations (with small values of the first and the second
derivatives). The discreteness of the net, onto which the limit behavior is reduced by small
continuous perturbations, differs from the discreteness of the support of the individual
$\omega$-limit distribution. For an individual distribution the number of points is estimated,
roughly speaking, by the number of essential parameters (TO)]), while for the union
of limit supports it is estimated by the number of stages in the approximation of $k_\mu$ by
piecewise constant functions.

Thus, in a typical case the dynamics of systems (1) with smooth reproduction coefficients
transforms a smooth initial distribution into an ensemble of narrow peaks. The
peaks become narrower and their motion slows down, but they do not always tend to fixed
positions.

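The equations of motion for these peaks can be obtained in the following way [25].
Let $X$ be a domain in the $n$-dimensional real space, and let the initial distributions $\mu_0$ be
assumed to have smooth density. Then, after a sufficiently large time $t$, the positions of the
distribution peaks are the points of maximum of the average reproduction coefficient $\langle k_\mu \rangle_t$
to any accuracy set in advance. Let these points of maximum be $x^\alpha$, and

\[
q^\alpha_{ij} = -t \left. \frac{\partial^2 \langle k_\mu \rangle_t}{\partial x_i \partial x_j} \right|_{x = x^\alpha}.
\]

It is easy to derive the following differential relations

\[
\sum_j q^\alpha_{ij} \frac{dx^\alpha_j}{dt} = \left. \frac{\partial k_{\mu(t)}}{\partial x_i} \right|_{x = x^\alpha}; \qquad
\frac{dq^\alpha_{ij}}{dt} = - \left. \frac{\partial^2 k_{\mu(t)}}{\partial x_i \partial x_j} \right|_{x = x^\alpha}. \tag{37}
\]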
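
The relations (37) can be traced back to the stationarity condition for the peak positions. The sketch below is a reading aid rather than the author's derivation. It assumes that for large $t$ the density is proportional to $\exp \Phi(x,t)$ with $\Phi(x,t) = t \langle k_\mu \rangle_t (x) = \int_0^t k_{\mu(\tau)}(x)\, d\tau$, that the contribution of the initial density is negligible, and that $q^\alpha_{ij}$ is minus the matrix of second derivatives of $\Phi$ at the peak $x^\alpha(t)$.

```latex
% Assumed notation: \Phi(x,t) = \int_0^t k_{\mu(\tau)}(x)\,d\tau,
% q^\alpha_{ij} = -\partial^2\Phi/\partial x_i\partial x_j at x = x^\alpha(t).
\begin{align*}
& \left.\frac{\partial \Phi}{\partial x_i}\right|_{x = x^\alpha(t)} = 0
  && \text{(the peak sits at a maximum)} \\
& \sum_j \left.\frac{\partial^2 \Phi}{\partial x_i \partial x_j}\right|_{x^\alpha}
  \frac{dx^\alpha_j}{dt}
  + \left.\frac{\partial k_{\mu(t)}}{\partial x_i}\right|_{x^\alpha} = 0
  && \text{(differentiate in } t\text{)} \\
& \sum_j q^\alpha_{ij} \frac{dx^\alpha_j}{dt}
  = \left.\frac{\partial k_{\mu(t)}}{\partial x_i}\right|_{x^\alpha}
  && \text{(first relation)} \\
& \frac{dq^\alpha_{ij}}{dt}
  = -\left.\frac{\partial^2 k_{\mu(t)}}{\partial x_i \partial x_j}\right|_{x^\alpha}
  - \sum_l \left.\frac{\partial^3 \Phi}{\partial x_i \partial x_j \partial x_l}\right|_{x^\alpha}
    \frac{dx^\alpha_l}{dt}
  && \text{(total derivative of } q\text{)}
\end{align*}
```

The second relation in (37) corresponds to the last line with the third-derivative term dropped; treating that term as a higher-order correction for slowly moving peaks is an assumption of this sketch.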



The exponent coefficients $q^\alpha_{ij}$ remain time dependent even when the distribution tends to
a $\delta$-function. It means (in this case) that the peaks become infinitely narrow. Nevertheless,
it is possible to change variables and represent the weak* tendency to a stationary discrete
distribution as the usual tendency to a fixed point, see (41) below.

The relations (37) do not form a closed system of equations, because the right-hand
sides are not functions of $x^\alpha_i$ and $q^\alpha_{ij}$. For sufficiently narrow peaks there should
be a separation of the relaxation times between the dynamics on the support and the
dynamics of the support: the relaxation of peak amplitudes (it can be approximated by
the relaxation of the distribution with the finite support $\{x^\alpha\}$) should be significantly
faster than the motion of the locations of the peaks, the dynamics of $\{x^\alpha\}$. Let us write
the first term of the corresponding asymptotics [28].

For the finite support $\{x^\alpha\}$ the distribution is $\mu = \sum_\alpha N_\alpha \delta(x - x^\alpha)$. The dynamics of the
finite number of variables $N_\alpha$ obeys the system of ordinary differential equations

\[
\frac{dN_\alpha}{dt} = k_\alpha(N) N_\alpha, \tag{38}
\]

where $N$ is the vector with components $N_\alpha$, and $k_\alpha(N)$ is the value of the reproduction
coefficient $k_\mu$ at the point $x^\alpha$:

\[
k_\alpha(N) = k_\mu(x^\alpha) \quad \text{for} \quad \mu = \sum_\alpha N_\alpha \delta(x - x^\alpha).
\]



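Let the dynamics of the system (38) for a given set of initial conditions be simple:
the motion $N(t)$ goes to the stable fixed point $N = N^*(\{x^\alpha\})$. Then we can take in the
right-hand side of (37)

\[
\mu(t) = \mu^*(\{x^\alpha(t)\}) = \sum_\alpha N^*_\alpha \delta(x - x^\alpha(t)). \tag{39}
\]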



Because of the time separation we can assume that (i) the relaxation of the amplitudes of
the peaks is completed and (ii) the peaks are sufficiently narrow, hence, the difference between
the true $k_{\mu(t)}$ and the reproduction coefficient for the measure (39) with the finite support
$\{x^\alpha\}$ is negligible. Let us use the notation $k^*(\{x^\alpha\})(x)$ for this reproduction coefficient.
The relations (37) transform into the ordinary differential equations

\[
\sum_j q^\alpha_{ij} \frac{dx^\alpha_j}{dt} = \left. \frac{\partial k^*(\{x^\beta\})(x)}{\partial x_i} \right|_{x = x^\alpha}; \qquad
\frac{dq^\alpha_{ij}}{dt} = - \left. \frac{\partial^2 k^*(\{x^\beta\})(x)}{\partial x_i \partial x_j} \right|_{x = x^\alpha}. \tag{40}
\]



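For many purposes it may be useful to switch to the logarithmic time $\tau = \ln t$ and to new
variables

\[
b^\alpha_{ij} = \frac{1}{t} q^\alpha_{ij} = - \left. \frac{\partial^2 \langle k_\mu \rangle_t}{\partial x_i \partial x_j} \right|_{x = x^\alpha}.
\]

For large $t$ we obtain from (40)

\[
\sum_j b^\alpha_{ij} \frac{dx^\alpha_j}{d\tau} = \left. \frac{\partial k^*(\{x^\beta\})(x)}{\partial x_i} \right|_{x = x^\alpha}; \qquad
\frac{db^\alpha_{ij}}{d\tau} = - \left. \frac{\partial^2 k^*(\{x^\beta\})(x)}{\partial x_i \partial x_j} \right|_{x = x^\alpha} - b^\alpha_{ij}. \tag{41}
\]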



The way of constructing the drift equations (40), (41) for a specific system (1) is as follows:

1. For finite sets $\{x^\alpha\}$ one studies systems (38) and finds the equilibrium solutions
$N^*(\{x^\alpha\})$;

2. For given measures $\mu^*(\{x^\alpha(t)\})$ (39) one calculates the reproduction coefficients
$k_\mu(x) = k^*(\{x^\alpha\})(x)$ and the first and second derivatives of these functions in $x$ at the
points $x^\alpha$. That is all, the drift equations (40), (41) are set up.

The drift equations (40), (41) describe the dynamics of the peak positions $x^\alpha$ and of the
coefficients $q^\alpha_{ij}$. For given $x^\alpha$, $q^\alpha_{ij}$ and $N^*_\alpha$ the distribution density $\mu$ can be approximated
as the sum of narrow Gaussian peaks:

\[
\mu(x) = \sum_\alpha N^*_\alpha \sqrt{\frac{\det Q^\alpha}{(2\pi)^n}}
\exp\Big( -\frac{1}{2} \sum_{ij} q^\alpha_{ij} (x_i - x^\alpha_i)(x_j - x^\alpha_j) \Big), \tag{42}
\]

where $Q^\alpha$ is the inverse covariance matrix $(q^\alpha_{ij})$.

If the limit dynamics of the system (38) for finite supports at $t \to \infty$ can be described
by a more complicated attractor, then instead of the reproduction coefficient $k^*(\{x^\alpha\})(x) =
k_{\mu^*}$ for the stationary measures $\mu^*$ (39) one can use the average reproduction coefficient
with respect to the corresponding Sinai-Ruelle-Bowen measure [321 SB]. If the finite systems
(38) have several attractors for given $\{x^\alpha\}$, then the dependence $k^*(\{x^\alpha\})$ is multi-valued,
and there may be bifurcations and hysteresis, with the function $k^*(\{x^\alpha\})$ passing from
one sheet to another. There are many interesting effects concerning the birth, disintegration,
divergence, and death of peaks, and the drift equations (40), (41) describe the motion in
a non-critical domain, between these critical effects.

Inheritance (conservation of support) is never absolutely exact. Small variations, mutations,
and immigration are very important in biological systems. Excitation of new degrees
of freedom, mode diffusion, and noise are present in physical systems. How does a small
perturbation of the inheritance affect the effects of selection? The answer is usually as follows:
there is a level of perturbation of the right-hand side of (1) below which almost nothing
changes: the limit $\delta$-shaped peaks are merely replaced by sufficiently narrow
peaks, and the zero limit of the velocity of their drift at $t \to \infty$ is replaced by a small finite
one.

3.2 Drift in presence of mutations and scaling invariance

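The simplest model for "inheritance + small variability" is given by a perturbation of (1)
with a diffusion term

\[
\frac{\partial \mu(x,t)}{\partial t} = k_{\mu(x,t)} \times \mu(x,t) + \varepsilon \sum_{ij} d_{ij}(x) \frac{\partial^2 \mu(x,t)}{\partial x_i \partial x_j}, \tag{43}
\]

where $\varepsilon > 0$ and the matrix of diffusion coefficients $d_{ij}$ is symmetric and positive definite.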

There are almost always no qualitative changes in the asymptotic behavior if $\varepsilon$ is
sufficiently small. In this case the asymptotics is again described by the drift equations
(40), (41), modified by taking into account the diffusion as follows:

\[
\sum_j q^\alpha_{ij} \frac{dx^\alpha_j}{dt} = \left. \frac{\partial k^*(\{x^\beta\})(x)}{\partial x_i} \right|_{x = x^\alpha}; \qquad
\frac{dq^\alpha_{ij}}{dt} = - \left. \frac{\partial^2 k^*(\{x^\beta\})(x)}{\partial x_i \partial x_j} \right|_{x = x^\alpha}
- 2\varepsilon \sum_{kl} q^\alpha_{ik} d_{kl}(x^\alpha) q^\alpha_{lj}. \tag{44}
\]

Now, as distinct from (40), the eigenvalues of the matrices $Q^\alpha = (q^\alpha_{ij})$ cannot grow
infinitely. This is prevented by the quadratic terms in the right-hand side of the second
equation (44).

The dynamics of (44) does not depend on the value $\varepsilon > 0$ qualitatively, because of the
obvious scaling property. If $\varepsilon$ is multiplied by a positive number $\nu$, then, upon rescaling
$t' = \nu^{-1/2} t$ and $q^{\alpha\prime}_{ij} = \nu^{-1/2} q^\alpha_{ij}$, we have the same system again. Multiplying $\varepsilon > 0$ by
$\nu > 0$ changes only the peak velocities, by a factor $\nu^{1/2}$, and the peak widths, by a factor
$\nu^{1/4}$. The paths of the peaks' motion do not change in the drift approximation
(but the applicability of this approximation may, of course, change).
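
The scaling property can be checked numerically on a one-peak, one-dimensional version of the drift equations with diffusion (44). The sketch below is illustrative only: the functions standing in for the first and second derivatives of $k^*$ at the peak, the diffusion coefficient, the initial data and all names are hypothetical choices, not taken from the text. It integrates the system for $\varepsilon$ and for $\nu\varepsilon$ with rescaled time and initial $q$, and returns both results brought to the same scale.

```python
def drift_rhs(x, q, eps, d=1.0):
    """One-peak, one-dimensional drift equations with diffusion."""
    dk = -(x - 1.0)              # hypothetical dk*/dx at the peak
    d2k = -(1.0 + 0.5 * x * x)   # hypothetical d2k*/dx2 at the peak (negative)
    dx = dk / q                           # q dx/dt = dk*/dx
    dq = -d2k - 2.0 * eps * d * q * q     # dq/dt = -d2k*/dx2 - 2 eps q d q
    return dx, dq

def integrate(x0, q0, eps, t_end, n_steps=2000):
    """Classical fourth-order Runge-Kutta integration from t = 0 to t_end."""
    h = t_end / n_steps
    x, q = x0, q0
    for _ in range(n_steps):
        k1 = drift_rhs(x, q, eps)
        k2 = drift_rhs(x + 0.5 * h * k1[0], q + 0.5 * h * k1[1], eps)
        k3 = drift_rhs(x + 0.5 * h * k2[0], q + 0.5 * h * k2[1], eps)
        k4 = drift_rhs(x + h * k3[0], q + h * k3[1], eps)
        x += h * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6.0
        q += h * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6.0
    return x, q

def scaling_check(nu=4.0, eps=0.01, x0=0.0, q0=5.0, t_end=10.0):
    """Run with eps, then with nu * eps after rescaling time and q."""
    c = nu ** -0.5                       # t' = c t, q' = c q
    x_a, q_a = integrate(x0, q0, eps, t_end)
    x_b, q_b = integrate(x0, c * q0, nu * eps, c * t_end)
    return x_a, q_a, x_b, q_b / c        # q_b / c should reproduce q_a
```

In this sketch the peak position reached by the two runs coincides and the values of $q$ differ by the factor $\nu^{-1/2}$, so the path is the same while it is traversed $\nu^{1/2}$ times faster.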



3.3 Three main types of stability 

Stable steady-state solutions of equations of the form (1) may be only sums of $\delta$-functions -
this was already mentioned. There is a set of specific conditions of stability,
determined by the form of the equations.

Consider a stationary distribution for (1) with a finite support

\[
\mu^*(x) = \sum_\alpha N^*_\alpha \delta(x - x^{*\alpha}).
\]

The steady state of $\mu^*$ means that

\[
k_{\mu^*}(x^{*\alpha}) = 0 \quad \text{for all } \alpha. \tag{45}
\]

The internal stability means that this distribution is stable with respect to perturbations
not increasing the support of $\mu^*$. That is, the vector $N^*_\alpha$ is a stable fixed point
for the dynamical system (38). Here, as usual, it is possible to distinguish between
Lyapunov stability, asymptotic stability and first approximation stability (negativeness
of the real parts of the eigenvalues of the matrix $\partial \dot N_\alpha / \partial N_\beta$ at the stationary points).

The external stability (uninvadability) means stability to an expansion of the support,
i.e. to adding to $\mu^*$ a small distribution whose support contains points not belonging
to $\mathrm{supp}\,\mu^*$. It makes sense to speak about the external stability only if there is internal
stability. In this case it is sufficient to restrict ourselves to $\delta$-functional perturbations.
The external stability has a very transparent physical and biological sense. It is stability
with respect to the introduction into the system of a new inherited unit (gene, variety,
species, ...) in a small amount.

The necessary condition for the external stability is: the points $\{x^{*\alpha}\}$ are points of
the global maximum of the reproduction coefficient $k_{\mu^*}(x)$. It can be formulated as the
optimality principle

\[
k_{\mu^*}(x) \le 0 \ \text{ for all } x; \qquad k_{\mu^*}(x^{*\alpha}) = 0. \tag{46}
\]

The sufficient condition for the external stability is: the points $\{x^{*\alpha}\}$ and only these
points are points of the global maximum of the reproduction coefficient $k_{\mu^*}(x)$. At the
same time it is the condition of the external stability in the first approximation and the
optimality principle

\[
k_{\mu^*}(x) < 0 \ \text{ for } x \notin \{x^{*\alpha}\}; \qquad k_{\mu^*}(x^{*\alpha}) = 0. \tag{47}
\]

The only difference from (46) is the change of the inequality sign from $k_{\mu^*}(x) \le 0$ to
$k_{\mu^*}(x) < 0$ for $x \notin \{x^{*\alpha}\}$. The necessary condition (46) means that a small $\delta$-functional
addition will not grow in the first approximation. According to the sufficient condition
(47) such a small addition will exponentially decrease.
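
The conditions (45)-(47) are easy to test numerically once a reproduction coefficient is specified. The sketch below uses a hypothetical model that is not taken from the text: $k_\mu(x) = r(x) - \sum_\alpha N_\alpha$ with $r(x) = 1 - x^2$, i.e. competition for a single shared resource. It checks stationarity on the support and then the sign of $k_{\mu^*}$ at grid points off the support. A finite grid is only a numerical proxy for "all $x$", and the tolerance is an arbitrary choice.

```python
def k_mu(x, masses):
    """Hypothetical reproduction coefficient k_mu(x) = r(x) - total mass."""
    r = 1.0 - x * x           # assumed intrinsic growth rate, maximal at x = 0
    return r - sum(masses)    # assumed uniform competition kernel K(x, y) = 1

def classify(support, masses, grid, tol=1e-9):
    """Check (45) on the support, then (46)/(47) on grid points off the support."""
    if any(abs(k_mu(a, masses)) > tol for a in support):
        return "not stationary"
    off_support = [k_mu(x, masses) for x in grid
                   if all(abs(x - a) > tol for a in support)]
    if all(v < -tol for v in off_support):
        return "sufficient condition (47)"
    if all(v <= tol for v in off_support):
        return "necessary condition (46) only"
    return "invadable"

GRID = [i / 100.0 for i in range(-100, 101)]

if __name__ == "__main__":
    print(classify([0.0], [1.0], GRID))    # peak at the maximum of r
    print(classify([0.5], [0.75], GRID))   # stationary, but off the maximum
```

In this toy model a single peak at $x = 0$ with mass 1 satisfies (47), while a stationary peak at $x = 0.5$ with mass 0.75 leaves $k_{\mu^*}(x) > 0$ near $x = 0$ and can therefore be invaded.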

If $X$ is a finite set, then the combination of the external and the internal stability is
equivalent to the standard stability for a system of ordinary differential equations.

For continuous $X$ there is one more kind of stability that is important from the applications
viewpoint. Substitute the $\delta$-shaped peaks at the points $\{x^{*\alpha}\}$ by narrow Gaussians
and shift slightly the positions of their maxima away from the points $x^{*\alpha}$. How will the
distribution evolve from such initial conditions? If it tends to $\mu^*$ without getting too distant
from this steady state distribution, then we can say that the third type of stability -
stable realizability - takes place. It is worth mentioning that a perturbation of this type
is only weakly* small, in contrast to the perturbations considered in the theory of internal
and external stability. Those perturbations are small in norm. Let us recall that
the norm of the measure $\mu$ is $\|\mu\| = \sup_{\|f\| \le 1} [\mu, f]$. If one shifts the $\delta$-measure of unit
mass by any nonzero distance $\varepsilon$, then the norm of the perturbation is 2. Nevertheless,
this perturbation weakly* tends to 0 with $\varepsilon \to 0$.

In order to formalize the condition of stable realizability it is convenient to use the drift
equations in the form (41). Let the distribution $\mu^*$ be internally and externally stable in
the first approximation. Let the points $x^{*\alpha}$ of global maxima of $k_{\mu^*}(x)$ be non-degenerate
in the second approximation. This means that the matrices

\[
\left( - \frac{\partial^2 k_{\mu^*}(x)}{\partial x_i \partial x_j} \right)_{x = x^{*\alpha}} \tag{48}
\]

are strictly positive definite for all $\alpha$.

Under these conditions of stability and non-degeneracy the coefficients of (41) can be
easily calculated using a Taylor series expansion in powers of $(x^\alpha - x^{*\alpha})$. The stable
realizability of $\mu^*$ in the first approximation means that the fixed point of the drift equations
(41) with the coordinates

\[
x^\alpha = x^{*\alpha}, \qquad
b^\alpha_{ij} = - \left. \frac{\partial^2 k_{\mu^*}(x)}{\partial x_i \partial x_j} \right|_{x = x^{*\alpha}} \tag{49}
\]

(where $b^\alpha_{ij} = q^\alpha_{ij}/t$ are the variables of (41)) is stable in the first approximation. It is the
usual stability for the system (41) of ordinary differential equations, and these conditions,
together with the notion of stable realizability, become clear directly from the
logarithmic time drift equations (41).

To explain the sense of stable realizability we used in the book [21] the idea of the
"Gardens of Eden" from J. H. Conway's "Game of Life" [21]. These are Game of Life patterns
that have no father patterns and therefore can occur only at generation 0, from the very
beginning. It is not known whether there exists a pattern that has a father pattern but no
grandfather pattern. The situation is the same for an internally and externally stable
(uninvadable) state that is not stably realizable: it cannot be destroyed by an invasion of
mutants or by a small variation of conditions, but, at the same time, it is not attractive for
drift and, hence, cannot be realized in this asymptotic motion. It can only be created.

The idea of drift and the corresponding stability notions become necessary in any approach
to evolutionary dynamics on continuous spaces. In the recent paper [2T], the asymptotic
stability under the replicator dynamics over a continuum of pure strategies was studied.
It was shown in [2T] that strong uninvadability of a pure strategy $x^*$ [I] is insufficient for
its stability with respect to the drift: it does not imply convergence to $x^*$ when starting
from a distribution of small deviations from $x^*$, regardless of how small these deviations
are. The standard idea of asymptotic stability is: "after a small deviation the system returns
to the initial regime, and does not deviate too much on the way back". The
crucial question for the measure dynamics is: in which topology is the deviation small?
A small shift of a narrow peak of the distribution in the continuous space of strategies
can be considered as a small deviation in the weak* topology, but it is definitely a large
deviation in the strong topology, for example, if the shift is not small in comparison with
the peak width. In the papers [61, 62] the idea of drift equations appeared again for
Gaussian peaks in the dynamics of continuous symmetric evolutionary games. The authors
of [61, 62] introduced the idea of "evolutionary robustness" (realizability) and claimed
the necessity of the additional notion of stability very energetically: "Furthermore, we
provide new conditions for the stability of rest points and show that even strict equilibria
may be unstable".





4 Example: Cell division self-synchronization



The results described above admit a whole family of generalizations. In particular, it
seems to be important to extend the theorems of selection to the case of vector distributions,
when $k_\mu(x)$ is a linear operator at each $\mu$, $x$. In this case, in the optimality principles
for steady-state distributions the maximal eigenvalues of these operators $k_\mu(x)$ appear
instead of the values of the reproduction coefficients. For general $\omega$-limit sets special
multiplicative operator averages are used [28]. It is also possible to make generalizations for
some classes of non-autonomous equations with explicit dependence of $k_\mu(x)$ on $t$ [28].

The availability of such a network of generalizations allows one to reason as
follows: what is inherited (i.e. that for which the law of conservation of support holds) is the
subject of selection (i.e. with respect to these variables at $t \to \infty$ the distribution becomes
discrete and the limit support can be described by the optimality principles).

This section gives a somewhat unconventional example of inheritance and selection,
when the reproduction coefficients are subject to additional conditions of symmetry.

Consider a culture of microorganisms in a certain medium (for example, pathogenic
microbes in the organism of a host). Assume, for simplicity, the following: let the time
period spent by these microorganisms for the whole life cycle be identical.

At the end of the life cycle the microorganism disappears and several new microorganisms
appear in the initial phase. Let $T$ be the time of the life cycle. Each microorganism
holds the value of the inherited variable: it is "the moment of its appearance (mod $T$)".
Indeed, if the given microorganism emerges at time $\tau$ ($0 < \tau \le T$), then its first descendants
appear at time $T + \tau$, the next generation at the moment $2T + \tau$, then $3T + \tau$
and so on.

It is natural to assume that the phase $\tau$ (mod $T$) is the inherited variable with some
accuracy. This implies selection of phases and, therefore, survival of only a discrete number
of them, $\tau_1, \ldots, \tau_m$. But the results of the preceding sections cannot be applied directly to this
problem. The reason is the additional symmetry of the system with respect to the phase
shift. But the typicalness of selection and the instability of the uniform distribution over
the phases $\tau$ (mod $T$) can be shown for this case, too. Let us demonstrate it with the
simplest model.

Let the difference between the microorganisms at each time moment be related to
the difference in the development phases only. Let us also assume that the state of the
medium can be considered as a function of the distribution $\mu(\tau)$ of microorganisms over
the phases $\tau \in\, ]0, T]$ (the quasi-steady state approximation for the medium). Consider
the system at discrete times $nT$ and assume the coefficient connecting $\mu$ at moments $nT$
and $nT + T$ to be the exponent of the linear integral operator value:

\[
\mu_{n+1}(\tau) = \mu_n(\tau) \exp\left[ k_0 - \int_0^T k_1(\tau - \tau')\, \mu_n(\tau')\, d\tau' \right]. \tag{50}
\]

Here, $\mu_n(\tau)$ is the distribution at the moment $nT$, $k_0 = \mathrm{const}$, and $k_1(\tau)$ is a periodic function
of period $T$.

This model is constructed in order to study the interaction of two factors of microbial
dynamics: the fixed period of the cell cycle, and the density-dependent interaction with
the medium. The density-dependent interaction is modeled in the same degree of generality
as in general Volterra equations, with one addition: for systems with discrete time
the exponential form of the reproduction coefficient is more natural and useful, as was shown
by Ricker [SS]. In this sense, (50) presents a hybrid Volterra-Ricker model for a microbial
population with a fixed period of the cell cycle. The deviation from the fixed period
can be formalized as a phase diffusion, as was presented for the general systems with
inheritance in the previous section; here we study the "pure" consequences of phase
inheritance.

This model significantly differs from the continuous time models, where the cell splitting
is presented as a "quasi-chemical process" of fission of a cell with size parameter
$2x$ into two identical daughters with parameter $x$: $[2x] \to [x] + [x]$ (any cell with size
parameter $x$ can spontaneously split at any time without any dependence on its history).
The probability of splitting depends here on the cell size only.

For example, a linear model for the growth of such a size-structured cell population,
reproducing by continuous "Markov" fission of cells, is formulated and identified in [18].
With known functions $a(x)$ (death), $g(x)$ (growth) and $b(x)$ ("loss due to splitting") the
model takes the form of a first order partial differential equation,

\[
\frac{\partial n(t,x)}{\partial t} + \frac{\partial (g(x)\, n(t,x))}{\partial x} = -a(x)\, n(t,x) - b(x)\, n(t,x) + 4 b(2x)\, n(t, 2x), \tag{51}
\]

for the density $n(t,x)$. The equation is accompanied by proper initial and boundary
conditions. Under several restrictions it is proved that the asymptotic behavior of solutions
for $t \to \infty$ has the form $n(t,x) = c\, e^{\lambda_d t} (n(x) + o(1))$, where the constant $c$ is the only
quantity in this expression that depends on the initial distribution. The way to establish the
sign of $\lambda_d$ and the explicit form of the function $n(x)$ is indicated. The continuous time Markov
property (independence of history) with a continuous kinetic coefficient $b(x)$ implies smoothing
of limit distributions. In model (50) the cell remembers the moment of its "birth" and
splits exactly after the living time $T$. This property is the main difference between (50) and
(51). The second difference is the nonlinearity of (50): we take into account the density-dependent
interaction of the cells (mediated by the state of the medium). The coefficient of this
interaction, $k_1(\tau - \tau')$, depends on the age difference of the interacting cells. This nonlinearity
in (50) includes in implicit form the structure of the population with age-determined size
difference, etc.

The uniform steady state $\mu^* \equiv n^* = \mathrm{const}$ for (50) is:

$$n^* = \frac{k}{\int_0^T k_1(\theta)\,d\theta}. \qquad (52)$$

In order to examine the stability of the uniform steady state $\mu^*$ (52), the system (50) is
linearized. For small deviations $\Delta\mu(\tau)$, in the linear approximation,

$$\Delta\mu_{n+1}(\tau) = \Delta\mu_n(\tau) - n^* \int_0^T k_1(\tau - \tau')\,\Delta\mu_n(\tau')\,d\tau'. \qquad (53)$$

Expand $k_1(\theta)$ into the Fourier series:

$$k_1(\theta) = b_0 + \sum_{n=1}^{\infty} \left( a_n \sin\left(2\pi n \frac{\theta}{T}\right) + b_n \cos\left(2\pi n \frac{\theta}{T}\right) \right). \qquad (54)$$

Denote by $A$ the operator of the right-hand side of (53). In the basis of functions

$$e_{sn} = \sin\left(2\pi n \frac{\tau}{T}\right), \quad e_{cn} = \cos\left(2\pi n \frac{\tau}{T}\right)$$

on the segment $]0,T]$ the operator $A$ is block-diagonal. The constant vector $e_0$ is an eigenvector,
$Ae_0 = \lambda_0 e_0$, $\lambda_0 = 1 - n^* b_0 T$. On the two-dimensional space generated by the vectors $e_{sn}$, $e_{cn}$
the operator $A$ acts as the matrix

$$A_n = \begin{pmatrix} 1 - \frac{Tn^*}{2} b_n & -\frac{Tn^*}{2} a_n \\ \frac{Tn^*}{2} a_n & 1 - \frac{Tn^*}{2} b_n \end{pmatrix}. \qquad (55)$$

The corresponding eigenvalues are

$$\lambda_{n\,1,2} = 1 - \frac{Tn^*}{2}\,(b_n \pm i a_n). \qquad (56)$$

For the uniform steady state $\mu^*$ (52) to be unstable it is sufficient that the absolute
value of at least one eigenvalue $\lambda_{n\,1,2}$ be larger than 1: $|\lambda_{n\,1,2}| > 1$. If there is at least one
negative Fourier cosine-coefficient $b_n < 0$, then $\mathrm{Re}\,\lambda_n > 1$, and thus $|\lambda_n| > 1$.
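The instability criterion can be checked directly from the eigenvalue formula (56). The sketch below is illustrative only: the function names and the sample coefficient values are assumptions, not taken from the text.

```python
def eigenvalues(n_star, T, a_n, b_n):
    """Eigenvalue pair 1 - (T n*/2)(b_n +/- i a_n) of the block A_n, as in (56)."""
    c = T * n_star / 2.0
    return (1 - c * (b_n + 1j * a_n), 1 - c * (b_n - 1j * a_n))

def is_unstable(n_star, T, coeffs):
    """True if some harmonic (a_n, b_n) gives an eigenvalue with modulus > 1."""
    return any(abs(lam) > 1.0
               for a_n, b_n in coeffs
               for lam in eigenvalues(n_star, T, a_n, b_n))

# A negative cosine coefficient makes Re(lambda) > 1, hence |lambda| > 1.
print(eigenvalues(1.0, 1.0, 0.3, -0.5))
print(is_unstable(1.0, 1.0, [(0.3, -0.5)]))
# A positive cosine coefficient with a small sine coefficient is stable.
print(is_unstable(1.0, 1.0, [(0.3, 0.5)]))
```

Note that $b_n < 0$ is sufficient but not necessary: a harmonic with $b_n > 0$ and a large enough $|a_n|$ also gives $|\lambda_n| > 1$.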

Note now that almost all periodic functions (continuous, smooth, or analytical, this does
not matter) have a negative Fourier cosine-coefficient. This can be understood as follows.
The sequence $b_n$ tends to zero as $n \to \infty$. Therefore, if all $b_n > 0$, then, by changing $b_n$ at
sufficiently large $n$, we can make $b_n$ negative, and the perturbation can be chosen
smaller than any previously set positive number. On the other hand, if some $b_n < 0$, then this
coefficient cannot be made non-negative by sufficiently small perturbations. Moreover,
the set of functions that have all Fourier cosine-coefficients non-negative is completely
thin, because for any compact set of functions $K$ (for most of the norms in use) the sequence
$B_n = \max_{f \in K} |b_n(f)|$ tends to zero, where $b_n(f)$ is the $n$th Fourier cosine-coefficient of
the function $f$.

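The model (50) is revealing, because for it we can trace the dynamics over large times
if we restrict ourselves to a finite segment of the Fourier series for $k_1(\theta)$. Describe it for

$$k_1(\theta) = b_0 + a \sin\left(2\pi \frac{\theta}{T}\right) + b \cos\left(2\pi \frac{\theta}{T}\right). \qquad (57)$$

Assume further that $b < 0$ (then the homogeneous distribution $\mu^* = \frac{k}{b_0 T}$ is unstable) and
$b_0 > \sqrt{a^2 + b^2}$ (then $\int \mu(\tau)\,d\tau$ cannot grow unbounded in time). Introduce notations

$$M_0(\mu) = \int_0^T \mu(\tau)\,d\tau, \quad M_c(\mu) = \int_0^T \cos\left(2\pi \frac{\tau}{T}\right) \mu(\tau)\,d\tau,$$

$$M_s(\mu) = \int_0^T \sin\left(2\pi \frac{\tau}{T}\right) \mu(\tau)\,d\tau, \quad \langle\mu\rangle_n = \frac{1}{n} \sum_{m=0}^{n-1} \mu_m, \qquad (58)$$

where $\mu_m$ is the distribution $\mu$ at the discrete time $m$.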
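In these notations,

$$\mu_{n+1}(\tau) = \mu_n(\tau) \exp\left[ k - b_0 M_0(\mu_n) - \left(aM_c(\mu_n) + bM_s(\mu_n)\right) \sin\left(2\pi \frac{\tau}{T}\right) + \left(aM_s(\mu_n) - bM_c(\mu_n)\right) \cos\left(2\pi \frac{\tau}{T}\right) \right]. \qquad (59)$$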
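The passage from the convolution form (50) to the moment form (59) rests only on the addition formulas for sine and cosine, so it can be verified numerically. The sketch below assumes a uniform grid, Riemann sums for all integrals, and an arbitrary positive test distribution; the function names and numerical values are illustrative assumptions.

```python
import math

def k1(theta, b0, a, b, T):
    """Single-harmonic kernel (57)."""
    w = 2.0 * math.pi / T
    return b0 + a * math.sin(w * theta) + b * math.cos(w * theta)

def moments(mu, T):
    """Riemann-sum approximations of M_0, M_c, M_s from (58)."""
    N = len(mu)
    h = T / N
    w = 2.0 * math.pi / T
    M0 = h * sum(mu)
    Mc = h * sum(m * math.cos(w * j * h) for j, m in enumerate(mu))
    Ms = h * sum(m * math.sin(w * j * h) for j, m in enumerate(mu))
    return M0, Mc, Ms

def exponent_direct(mu, i, k, b0, a, b, T):
    """Exponent of (50) at grid point i, via the discretized convolution."""
    N = len(mu)
    h = T / N
    conv = h * sum(k1((i - j) * h, b0, a, b, T) * mu[j] for j in range(N))
    return k - conv

def exponent_moments(mu, i, k, b0, a, b, T):
    """Exponent of (59) at grid point i, via the three moments."""
    M0, Mc, Ms = moments(mu, T)
    w = 2.0 * math.pi / T
    tau = i * T / len(mu)
    return (k - b0 * M0 - (a * Mc + b * Ms) * math.sin(w * tau)
            + (a * Ms - b * Mc) * math.cos(w * tau))

N, T = 50, 2.0
mu = [1.0 + 0.5 * math.sin(6.0 * math.pi * j / N) + 0.3 * math.cos(2.0 * math.pi * j / N)
      for j in range(N)]
max_diff = max(abs(exponent_direct(mu, i, 1.0, 1.0, 0.3, -0.5, T)
                   - exponent_moments(mu, i, 1.0, 1.0, 0.3, -0.5, T))
               for i in range(N))
print("largest discrepancy between (50) and (59):", max_diff)
```

The two exponents agree to rounding error, which is why the infinite-dimensional map reduces to tracking three scalar functionals.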



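Represent the distribution $\mu_n(\tau)$ through the initial distribution $\mu_0(\tau)$ and the values of
the functionals $M_0$, $M_c$, $M_s$ for the average distribution $\langle\mu\rangle_n$:

$$\mu_n(\tau) = \mu_0(\tau) \exp\left\{ n \left[ k - b_0 M_0(\langle\mu\rangle_n) - \left(aM_c(\langle\mu\rangle_n) + bM_s(\langle\mu\rangle_n)\right) \sin\left(2\pi \frac{\tau}{T}\right) + \left(aM_s(\langle\mu\rangle_n) - bM_c(\langle\mu\rangle_n)\right) \cos\left(2\pi \frac{\tau}{T}\right) \right] \right\}. \qquad (60)$$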



The exponent in (60) is either independent of $\tau$, or it is a function with a single
maximum on $]0,T]$. The coordinate $\tau^\#_n$ of this maximum is easily calculated:

$$\tau^\#_n = -\frac{T}{2\pi} \arctan \frac{aM_c(\langle\mu\rangle_n) + bM_s(\langle\mu\rangle_n)}{aM_s(\langle\mu\rangle_n) - bM_c(\langle\mu\rangle_n)}. \qquad (61)$$
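The location of the maximum can be computed as follows. This is a sketch under stated assumptions: with $P = aM_c + bM_s$ and $Q = aM_s - bM_c$, the $\tau$-dependent part of the exponent in (60) is proportional to $-P\sin(2\pi\tau/T) + Q\cos(2\pi\tau/T)$, and the two-argument `atan2` is used in place of the arctangent of the ratio so that the maximum, rather than the minimum, is selected. The function names and test values are hypothetical.

```python
import math

def peak_phase(P, Q, T):
    """Phase in [0, T) that maximizes -P*sin(2*pi*tau/T) + Q*cos(2*pi*tau/T).

    The expression equals R*cos(2*pi*tau/T - phi) with R*cos(phi) = Q and
    R*sin(phi) = -P, so the maximum is at 2*pi*tau/T = phi = atan2(-P, Q).
    """
    phi = math.atan2(-P, Q)
    return (T * phi / (2.0 * math.pi)) % T

def brute_force_peak(P, Q, T, N=10000):
    """Grid search over N phases, used only to cross-check peak_phase."""
    w = 2.0 * math.pi / T
    best = max(range(N),
               key=lambda i: -P * math.sin(w * i * T / N) + Q * math.cos(w * i * T / N))
    return best * T / N

T = 2.0
tau_formula = peak_phase(0.7, -0.2, T)
tau_grid = brute_force_peak(0.7, -0.2, T)
print(tau_formula, tau_grid)
```

At the returned phase $\tan(2\pi\tau/T) = -P/Q$, which is the stationarity condition behind (61).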



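Let the non-uniform smooth initial distribution $\mu_0$ have the whole segment $]0,T]$ as its
support. As time progresses, the distribution $\mu_n(\tau)$ takes the shape of an ever narrowing
peak. With high accuracy at large $n$ we can approximate $\mu_n(\tau)$ by the Gaussian distribution
(approximation accuracy is understood in the weak* sense, as closeness of mean
values):

$$\mu_n(\tau) \approx M_0 \sqrt{\frac{q_n}{\pi}} \exp\left[-q_n (\tau - \tau^\#_n)^2\right], \quad M_0 = \frac{k}{b_0 + b},$$

$$q_n^2 = \frac{n^2}{4} \left(\frac{2\pi}{T}\right)^4 \left[ \left(aM_c(\langle\mu\rangle_n) + bM_s(\langle\mu\rangle_n)\right)^2 + \left(aM_s(\langle\mu\rangle_n) - bM_c(\langle\mu\rangle_n)\right)^2 \right]. \qquad (62)$$

Here $q_n$ is obtained by expanding the exponent in (60) to second order around its maximum $\tau^\#_n$.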

Expression (62) involves the average measure $\langle\mu\rangle_n$, which is difficult to compute. However,
we can operate without direct computation of $\langle\mu\rangle_n$. At $q_n \gg 1/T^2$ we can compute $q_{n+1}$
and $\tau^\#_{n+1}$:

$$\mu_{n+1}(\tau) \approx M_0 \sqrt{\frac{q_n + \Delta q}{\pi}} \exp\left[-(q_n + \Delta q)(\tau - \tau^\#_n - \Delta\tau^\#)^2\right],$$

$$\Delta q \approx -\frac{1}{2}\, b M_0 \left(\frac{2\pi}{T}\right)^2, \quad \Delta\tau^\# \approx -\frac{1}{q_n}\, a M_0\, \frac{\pi}{T}. \qquad (63)$$

The accuracy of these expressions grows with time $n$. The value $q_n$ grows at large $n$
almost linearly, and $\tau^\#_n$, respectively, as the sum of the harmonic series (mod $T$), i.e. as
$\ln n$ (mod $T$). The drift effect takes place: as $n \to \infty$, the location of the peak $\tau^\#_n$ covers a
distance diverging as $\ln n$.
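The logarithmic divergence of the path can be illustrated by iterating the increments directly. This is a minimal sketch under the assumption that the per-step increments are $\Delta q = -\frac{1}{2} b M_0 (2\pi/T)^2$ and $\Delta\tau^\# = -a M_0 \pi/(T q_n)$, as obtained by expanding the kernel (57) around a narrow peak. The function name and all parameter values are illustrative.

```python
import math

def drift_path(n_steps, q0, a, b, M0, T):
    """Accumulate the distance covered by the peak while q grows linearly."""
    dq = -0.5 * b * M0 * (2.0 * math.pi / T) ** 2   # positive when b < 0
    q = q0
    path = 0.0
    for _ in range(n_steps):
        path += abs(a * M0 * math.pi / (T * q))     # |shift of the peak| at this step
        q += dq
    return path, q

a, b, M0, T, q0 = 0.3, -0.5, 2.0, 1.0, 100.0
dq = -0.5 * b * M0 * (2.0 * math.pi / T) ** 2
c = abs(a) * M0 * math.pi / (T * dq)                # predicted coefficient of ln n

p_small, _ = drift_path(10 ** 4, q0, a, b, M0, T)
p_large, q_large = drift_path(10 ** 5, q0, a, b, M0, T)
ratio = (p_large - p_small) / (c * math.log(10.0))
print("path gained per decade of n, relative to c*ln(10):", ratio)
```

Each tenfold increase of $n$ adds nearly the same increment $c \ln 10$ to the path, so the path is unbounded although the step size tends to zero.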

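Of interest is the case when $b > 0$ but

$$|\lambda_1|^2 = \left(1 - \frac{Tn^*}{2}\, b\right)^2 + \left(\frac{Tn^*}{2}\, a\right)^2 > 1.$$

In this case the homogeneous distribution $\mu^* \equiv n^*$ is not stable, but $\mu$ does not tend to $\delta$-functions.
There are smooth stable "self-synchronization waves" of the form

$$\mu_n = \gamma \exp\left[ q \cos\left( (\tau - n\Delta\tau^\#)\, \frac{2\pi}{T} \right) \right].$$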

At small $b > 0$ ($b \ll |a|$, $bM_0 \ll a^2$) we can find the explicit form of approximate expressions
for $q$ and $\Delta\tau^\#$:

$$q \approx \frac{a^2 M_0}{2b}, \quad \Delta\tau^\# \approx -\frac{bT}{\pi a}. \qquad (64)$$

As $b > 0$, $b \to 0$, the smooth self-synchronization waves become ever narrowing peaks, and
their steady velocity approaches zero. If $b = 0$, $|\lambda_1|^2 > 1$, then the effect of selection takes
place again, and for almost all initial conditions $\mu_0$ with the support being the whole
segment $]0,T]$ the distribution $\mu_n$ takes at large $n$ the form of a slowly drifting almost
Gaussian peak. It becomes narrower with time, and the motion slows down. Instead
of the linear growth of $q_n$ which takes place at $b < 0$ (63), for $b = 0$ we have $q_{n+1} - q_n \approx \mathrm{const} \cdot q_n^{-1}$,
and $q_n$ grows as $\mathrm{const} \cdot \sqrt{n}$.
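The square-root law follows from the recursion itself: if $q_{n+1} - q_n = c/q_n$, then $q_{n+1}^2 - q_n^2 = 2c + c^2/q_n^2$, so $q_n^2 \approx 2cn$ at large $n$. The sketch below iterates this recursion; the constant $c$ and the starting value are arbitrary illustrative choices, and the function name is hypothetical.

```python
import math

def sharpen(n_steps, q0=1.0, c=1.0):
    """Iterate q_{n+1} = q_n + c / q_n, the b = 0 counterpart of (63)."""
    q = q0
    for _ in range(n_steps):
        q += c / q
    return q

n = 10 ** 5
q_n = sharpen(n)
ratio_sqrt = q_n / math.sqrt(2.0 * 1.0 * n)   # compare with sqrt(2 c n), c = 1
print("q_n / sqrt(2 c n):", ratio_sqrt)
```

The peak therefore still narrows without bound at $b = 0$, but much more slowly than in the linear regime.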

The parametric portrait of the system for the simple reproduction coefficient (57) is
presented in Fig. 1.

As usual, a small desynchronization transforms $\delta$-functional limit peaks into narrow
Gaussian peaks, and the velocity of the peaks tends to a small but nonzero value instead of
zero. Systems with small desynchronization can be described by equations of the form
(44). A large desynchronization can completely destroy the effects of phase selection
and, for example, might lead to a globally stable uniform phase distribution.

There are many specific mechanisms of synchronization and desynchronization in
physics and biology (see, for example, [3], [63], HQ, [60], [13]). We have described here a very
simple but universal mechanism: it requires only that the duration of the life cycle is fixed.
In this case, in a generic situation, we should observe self-synchronization. Of course, the
real-world situation can be much more complicated, with plenty of additional factors,
but the basic mechanism of "phase selection" always works if the life cycle has a more
or less fixed duration.



Figure 1: The simplest model of cell division self-synchronization: the parametric portrait.
[Axes: $n^*aT/2$ and $n^*bT/2$. Regions: stable waves with non-zero velocity, $v = Tb/(\pi a)$;
waves with velocity $v_n \sim 1/n \to 0$.]



Conclusion: Main results about systems with inheritance

1. If a kinetic equation has the quasi-biological form (1), then it has a rich system of
invariant manifolds: for any closed subset $A \subset X$ the set of distributions $M_A = \{\mu \mid
\mathrm{supp}\,\mu \subset A\}$ is invariant with respect to the system (1). These invariant manifolds
form an important algebraic structure; the summation of manifolds is possible:

$$M_A \oplus M_B = M_{A \cup B}.$$

(Of course, $M_{A \cap B} = M_A \cap M_B$.)

2. Typically, all the $\omega$-limit points belong to invariant manifolds $M_A$ with finite $A$
(from the application point of view there is no difference between finite and almost
finite sets). The finite-dimensional approximations of the reproduction coefficient
(15) provide the minimax estimation of the number of points in $A$.

3. Typically, systems with inheritance have a rich family of Lyapunov functionals of
the form (36), similar to the Burg entropy. These functionals can be interpreted as
measures of decreasing diversity.

4. For systems with inheritance (1) a solution typically tends to a finite set of
narrow peaks that become narrower and narrower in time and move more and more slowly.
It is possible that these peaks do not tend to fixed positions; rather, they continue
moving, and the path covered tends to infinity as $t \to \infty$. This is the drift effect.

5. The equations for peak dynamics, the drift equations (40), (41), (44), describe the dynamics
of the shapes of the peaks and their positions. For systems with small variability
("mutations") the drift equations (44) have the scaling property: a change in the
intensity of mutations is equivalent to a change of the time scale.

6. Three specific types of stability are important for systems with inheritance:
internal stability (stability with respect to perturbations without extension of the
distribution support), external stability (stability with respect to a small one-point
extension of the distribution support), and stable realizability (stability with respect to
weakly* small perturbations: small extensions and small shifts of the peaks; these
perturbations are small in the weak* topology).
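The invariance of $M_A$ in item 1 can be illustrated on a finite set of points. This is a toy sketch, not the general system (1): it assumes a discrete-time multiplicative update in the spirit of (50), a hypothetical reproduction coefficient $k(x) = r(x) - \sum_y \mu(y)$, and distributions stored as dictionaries from points to masses. All names and values are illustrative.

```python
import math

def step(mu, r):
    """Multiplicative update mu(x) -> mu(x) * exp(r(x) - total mass).

    A point with zero mass stays at zero, so the support cannot grow.
    """
    total = sum(mu.values())
    return {x: m * math.exp(r[x] - total) for x, m in mu.items()}

def support(mu):
    """Set of points carrying positive mass."""
    return {x for x, m in mu.items() if m > 0.0}

def add(mu, nu):
    """Pointwise sum of two distributions."""
    return {x: mu.get(x, 0.0) + nu.get(x, 0.0) for x in set(mu) | set(nu)}

X = range(6)
r = {x: 1.0 - 0.1 * x for x in X}                        # hypothetical fitness values
A, B = {0, 2}, {2, 5}
mu_A = {x: (0.5 if x in A else 0.0) for x in X}
mu_B = {x: (0.5 if x in B else 0.0) for x in X}

evolved_A = mu_A
evolved_sum = add(mu_A, mu_B)
for _ in range(50):
    evolved_A = step(evolved_A, r)
    evolved_sum = step(evolved_sum, r)

print(support(evolved_A) <= A)              # the support stays inside A
print(support(add(mu_A, mu_B)) == A | B)    # a sum of M_A and M_B lies in M_{A u B}
print(support(evolved_sum) <= A | B)        # and it stays there under the dynamics
```

The same mechanism is what makes selection possible: the dynamics can only redistribute mass among points already present in the support.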

The cell division self-synchronization demonstrates the effects of an unusual inherited unit:
it is an example of "phase selection". One specific property of this selection is an additional
symmetry with respect to phase shift. In this case, the general results about selection
cannot be used directly. Nevertheless, the "equivariant" selection theory also works
successfully.

Some exact results of the mathematical selection theory can be found in [50], [51]. There
exist many physical examples of systems with inheritance [851 ESI [531 12SI HS1 ESI [73]. A 
wide field of ecological applications was described in the book [72]. An introduction to
adaptive dynamics was given in the notes [32], which illustrate, largely by way of examples, how
standard ecological models can be put into an evolutionary perspective in order to gain
insight into the role of natural selection in shaping life history characteristics.

Acknowledgement. The author is grateful to V. Okhonin, who involved him many
years ago in the analysis of mathematical models of natural selection. M. Gromov kindly
allowed the author to quote his talk [37].